I have a dataframe as shown below:
PR Order Season Rj
0 3001913971 3445046069 202112 NaN
1 3002026058 1445132366 202121 NaN
2 3002026059 1445132365 202122 NaN
3 3002026063 1445132367 202211 NaN
4 3002026069 1445132375 202121 NaN
When I first run the code below, it works fine:
df['Season'] = df['Season'].astype(str)
df.loc[(df['Season'].str[-2:] == '11') & (df['Season'].str.len() == 6),'Season'] = 'Spring ' + df.loc[df['Season'].str[-2:] == '11','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '12') & (df['Season'].str.len() == 6),'Season'] = 'Summer ' + df.loc[df['Season'].str[-2:] == '12','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '21') & (df['Season'].str.len() == 6),'Season'] = 'Autumn ' + df.loc[df['Season'].str[-2:] == '21','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '22') & (df['Season'].str.len() == 6),'Season'] = 'Holiday ' + df.loc[df['Season'].str[-2:] == '22','Season'].str[:4]
The result of the first run looks like this:
PR Order Season Rj
0 3001913971 3445046069 Summer 2021 NaN
1 3002026058 1445132366 Autumn 2021 NaN
2 3002026059 1445132365 Holiday 2021 NaN
3 3002026063 1445132367 Spring 2022 NaN
4 3002026069 1445132375 Autumn 2021 NaN
But when I run it a second time, it raises an error:
ValueError: Must have equal len keys and value when setting with an iterable
Do you know why? Many thanks.
When you run the code a second time, none of the Season strings has length 6 anymore (and none of them has 11 as its last two characters), so the second line of your code tries to assign the 'Spring ' strings to an empty slice of the dataframe, which is of course impossible.
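A quick way to check this is to count how many rows each mask selects on the second run. The sketch below assumes the Season values are exactly those shown in the question's first-run output; the names `season`, `targeted` and `source` are only illustrative. It also counts the right-hand-side masks, which in the question's code have no length check and can therefore select a different number of rows than the left-hand side.

```python
import pandas as pd

# Season values as they look after the first run (copied from the question).
season = pd.Series(
    ['Summer 2021', 'Autumn 2021', 'Holiday 2021', 'Spring 2022', 'Autumn 2021']
)

suffixes = ['11', '12', '21', '22']

# Rows the assignment writes to: suffix matches AND the string has length 6.
targeted = {
    s: int(((season.str[-2:] == s) & (season.str.len() == 6)).sum())
    for s in suffixes
}

# Rows the right-hand side reads from: suffix matches, no length check.
source = {s: int((season.str[-2:] == s).sum()) for s in suffixes}

print(targeted)
print(source)
```

If the two counts disagree for some suffix, the assignment has a different number of target rows than values, which fits the "equal len keys and value" wording of the error message.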
In general, when you extract data like this, it's often a good idea to keep the original column and add the derived values as additional columns. This avoids the problem above and may also help catch errors. Redundancy can be a good thing. By the way, you can also extract the data directly from the integer values, without converting them to strings first. Floor division and the modulo operator are all you need:
df['Year'] = df.Season // 100
df['Season_cat'] = (df.Season % 100).astype('category').cat.rename_categories(
{11: 'Spring', 12: 'Summer', 21: 'Autumn', 22: 'Holiday'})
df
PR Order Season Rj Year Season_cat
0 3001913971 3445046069 202112 NaN 2021 Summer
1 3002026058 1445132366 202121 NaN 2021 Autumn
2 3002026059 1445132365 202122 NaN 2021 Holiday
3 3002026063 1445132367 202211 NaN 2022 Spring
4 3002026069 1445132375 202121 NaN 2021 Autumn
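Because the original Season column is left untouched, this kind of code can be run any number of times with the same result. Here is a minimal sketch of that idea, assuming Season holds six-digit integers as in the question. The helper `add_season_columns`, the `SEASON_NAMES` dict and the `Season_label` column are hypothetical names introduced for illustration, and `Series.map` is used here instead of `rename_categories` as a variation, not as the method from the answer above.

```python
import pandas as pd

# Mapping of the two-digit season code to its name (taken from the question).
SEASON_NAMES = {11: 'Spring', 12: 'Summer', 21: 'Autumn', 22: 'Holiday'}

def add_season_columns(df):
    """Return a copy of df with derived columns; 'Season' itself is not modified."""
    out = df.copy()
    codes = out['Season'].astype(int)
    out['Year'] = codes // 100                            # first four digits
    out['Season_cat'] = (codes % 100).map(SEASON_NAMES)   # last two digits
    # Combined label in the style the question wanted, e.g. 'Summer 2021'.
    out['Season_label'] = out['Season_cat'] + ' ' + out['Year'].astype(str)
    return out

# Only the Season column from the question is reproduced here.
df = pd.DataFrame({'Season': [202112, 202121, 202122, 202211, 202121]})

once = add_season_columns(df)
twice = add_season_columns(once)   # second run: no error, same result

print(twice)
```

One difference worth noting: with `map`, a season code that is missing from the dict becomes NaN rather than raising, so unexpected codes should be checked separately if that matters.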