
Python Pandas Dataframe ValueError: Must have equal len keys and value when setting with an iterable

I have a dataframe as shown below:

    PR          Order       Season  Rj
0   3001913971  3445046069  202112  NaN
1   3002026058  1445132366  202121  NaN
2   3002026059  1445132365  202122  NaN
3   3002026063  1445132367  202211  NaN
4   3002026069  1445132375  202121  NaN

When I first run the code below, it works fine:

df['Season'] = df['Season'].astype(str)
df.loc[(df['Season'].str[-2:] == '11') & (df['Season'].str.len() == 6),'Season'] = 'Spring ' + df.loc[df['Season'].str[-2:] == '11','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '12') & (df['Season'].str.len() == 6),'Season'] = 'Summer ' + df.loc[df['Season'].str[-2:] == '12','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '21') & (df['Season'].str.len() == 6),'Season'] = 'Autumn ' + df.loc[df['Season'].str[-2:] == '21','Season'].str[:4]
df.loc[(df['Season'].str[-2:] == '22') & (df['Season'].str.len() == 6),'Season'] = 'Holiday ' + df.loc[df['Season'].str[-2:] == '22','Season'].str[:4]

The result of the first run looks like this:

    PR          Order       Season        Rj
0   3001913971  3445046069  Summer 2021   NaN
1   3002026058  1445132366  Autumn 2021   NaN
2   3002026059  1445132365  Holiday 2021  NaN
3   3002026063  1445132367  Spring 2022   NaN
4   3002026069  1445132375  Autumn 2021   NaN

But when I run it a second time, it raises an error:

ValueError: Must have equal len keys and value when setting with an iterable

Do you know why? Many thanks.

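When you run the code a second time, none of the Season strings has length 6 anymore, so every mask on the left-hand side of your assignments selects an empty slice of the dataframe. The masks on the right-hand side omit the length check, though, so they can still match: 'Summer 2021', 'Autumn 2021' and 'Holiday 2021' all end in '21', which means the 'Autumn ' line most likely ends up trying to assign four values to zero rows. pandas cannot line those up, hence the complaint that keys and values must have equal length.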
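If you want to keep the string-based approach, one way to make it safe to run repeatedly is to compute each mask once and use the same mask on both sides of the assignment. The following is only a minimal sketch built on the sample data from the question (the Rj column is left out for brevity); the `labels` dict and the `relabel` helper are illustrative names, not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({
    "PR": [3001913971, 3002026058, 3002026059, 3002026063, 3002026069],
    "Order": [3445046069, 1445132366, 1445132365, 1445132367, 1445132375],
    "Season": [202112, 202121, 202122, 202211, 202121],
})

# Hypothetical lookup from the two-character suffix to a season name.
labels = {"11": "Spring", "12": "Summer", "21": "Autumn", "22": "Holiday"}


def relabel(frame):
    """Turn six-character codes such as '202112' into 'Summer 2021'."""
    frame["Season"] = frame["Season"].astype(str)
    # Work from a snapshot so that earlier assignments cannot affect later masks.
    season = frame["Season"].copy()
    for suffix, name in labels.items():
        # The same mask selects the rows to set and the values to set them with,
        # so both sides always have the same length (possibly zero).
        mask = (season.str.len() == 6) & (season.str[-2:] == suffix)
        frame.loc[mask, "Season"] = name + " " + season[mask].str[:4]
    return frame


relabel(df)
relabel(df)  # second run: every mask is empty, so nothing is changed
```

Because the right-hand side is indexed with the very mask used on the left, a rerun assigns zero values to zero rows instead of raising.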

In general, when you extract data like this, it's often a good idea to keep the original column and add the derived values as additional columns. This avoids the problem above, because the source column is never overwritten and the code can be rerun safely, and it may also help you catch errors. Redundancy can be a good thing. By the way, you can also extract the data directly from the integer values, without converting them to strings first. Floor division and the modulo operator are all you need:

df['Year'] = df.Season // 100
df['Season_cat'] = (df.Season % 100).astype('category').cat.rename_categories(
    {11: 'Spring', 12: 'Summer', 21: 'Autumn', 22: 'Holiday'})

df
    PR          Order       Season  Rj      Year    Season_cat
0   3001913971  3445046069  202112  NaN     2021    Summer
1   3002026058  1445132366  202121  NaN     2021    Autumn
2   3002026059  1445132365  202122  NaN     2021    Holiday
3   3002026063  1445132367  202211  NaN     2022    Spring
4   3002026069  1445132375  202121  NaN     2021    Autumn
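If you still want the combined label from the question (for example 'Summer 2021'), it can be built from the same two integer parts. This is a rough sketch under the assumption that Season still holds the original integers; `names` and the `Season_label` column are made-up names for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Season": [202112, 202121, 202122, 202211, 202121]})

# Hypothetical lookup from the two-digit season code to its name.
names = {11: "Spring", 12: "Summer", 21: "Autumn", 22: "Holiday"}

year = df["Season"] // 100   # 202112 -> 2021
code = df["Season"] % 100    # 202112 -> 12

# Write to a new column (name chosen for this example) and leave Season untouched,
# so running these lines again gives the same result.
df["Season_label"] = code.map(names) + " " + year.astype(str)
```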
