I am returning data from a database query and want to create a new column in the resulting dataframe. I need to shift the results of one column forward one month to create a new column.
I have a dataframe that is populated from a sql query and has the format:
df.dtypes
ACTIVITY_MONTH datetime64[ns]
PRODUCT_KEY object
COUNT float64
When I run:
df['NEW_COUNT'] = df.groupby('PRODUCT_KEY')['COUNT'].shift(+1)
I get this error:
ValueError: cannot reindex from a duplicate axis
This error doesn't make sense to me and I am not sure what to do to fix it. Any help is appreciated.
The error ValueError: cannot reindex from a duplicate axis indicates in this case that you have duplicate entries in your index (and for this reason, it cannot assign to a new column, as pandas cannot know where to place the values for the duplicate entries).
To check for duplicate values in the index, you can do:
df.index[df.index.duplicated()].unique()  # Index.get_duplicates() was removed in pandas 1.0
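A minimal sketch of inspecting an index for repeated labels, using Index.is_unique and Index.duplicated. The frame toy and its values are hypothetical and only stand in for the query result:

```python
import pandas as pd

# Hypothetical frame whose index repeats the labels 0 and 1.
toy = pd.DataFrame({"COUNT": [10.0, 20.0, 5.0, 7.0]}, index=[0, 1, 0, 1])

# Quick yes/no check: is every index label distinct?
has_unique_index = toy.index.is_unique

# The labels that occur more than once (each listed a single time).
dup_labels = toy.index[toy.index.duplicated()].unique()

# Every row that shares its label with another row.
dup_rows = toy[toy.index.duplicated(keep=False)]

print(has_unique_index)
print(list(dup_labels))
print(dup_rows)
```

With keep=False every occurrence of a repeated label is flagged, which is handy for seeing which rows collide rather than just which labels do.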
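And then to get rid of the duplicate values (if you don't need to keep the original index), you can e.g. do df = df.reset_index(drop=True) (note that reset_index returns a new dataframe, so the result has to be assigned back), or you can pass ignore_index=True to concat (or to append in pandas versions before 2.0, where it still exists) when building the dataframe.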
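As a rough sketch of the fix, assume the dataframe was built by concatenating two chunks, which is one common way to end up with repeated index labels. The chunk names and values below are hypothetical, and the sort step is an added assumption: shift moves values by row position, so the rows need to be in month order within each product.

```python
import pandas as pd

# Hypothetical chunks standing in for the SQL result.
part1 = pd.DataFrame({
    "ACTIVITY_MONTH": pd.to_datetime(["2014-01-01", "2014-02-01"]),
    "PRODUCT_KEY": ["A", "A"],
    "COUNT": [10.0, 20.0],
})
part2 = pd.DataFrame({
    "ACTIVITY_MONTH": pd.to_datetime(["2014-01-01", "2014-02-01"]),
    "PRODUCT_KEY": ["B", "B"],
    "COUNT": [5.0, 7.0],
})

# Plain concat keeps each chunk's own labels, so 0 and 1 both appear twice.
stacked = pd.concat([part1, part2])
index_was_unique = stacked.index.is_unique

# Option 1: build a fresh 0..n-1 index while concatenating.
df = pd.concat([part1, part2], ignore_index=True)

# Option 2: repair an existing frame after the fact (same result here).
repaired = stacked.reset_index(drop=True)

# Assumption: order rows by month within each product before shifting.
df = df.sort_values(["PRODUCT_KEY", "ACTIVITY_MONTH"])
df["NEW_COUNT"] = df.groupby("PRODUCT_KEY")["COUNT"].shift(1)
print(df)
```

Each product's first month has no previous value, so NEW_COUNT is NaN there; the remaining rows carry the previous month's COUNT.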