How to create a new dataframe column with shifted values from another column?
I am returning data from a database query and want to create a new column in the resulting dataframe. I need to shift the results of one column forward one month to create a new column.
I have a dataframe that is populated from a SQL query and has the format:
df.dtypes
ACTIVITY_MONTH datetime64[ns]
PRODUCT_KEY object
COUNT float64
When I run:
df['NEW_COUNT'] = df.groupby('PRODUCT_KEY')['COUNT'].shift(+1)
I get this error:
ValueError: cannot reindex from a duplicate axis
This error doesn't make sense to me and I am not sure what to do to fix it. Any help is appreciated.
The error ValueError: cannot reindex from a duplicate axis indicates in this case that you have duplicate entries in your index. For this reason the result cannot be assigned to a new column, as pandas cannot know where to place the values for the duplicated entries.
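To check for duplicate values in the index, you can do the following (Index.get_duplicates() was deprecated in pandas 0.23 and removed in 1.0, so the equivalent expression is shown):
df.index[df.index.duplicated()].unique()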
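As an illustration of how a duplicated index can arise and how to detect it, here is a minimal sketch. The sample frames `chunk_a` and `chunk_b` are hypothetical stand-ins for query results and are not taken from the question. It assumes the data was assembled with `pd.concat` without `ignore_index=True`, which is one common cause but may not be what happened in the original case.

```python
import pandas as pd

# Hypothetical sample data: two query chunks, each with its own 0-based index.
chunk_a = pd.DataFrame({
    "ACTIVITY_MONTH": pd.to_datetime(["2015-01-01", "2015-02-01"]),
    "PRODUCT_KEY": ["A", "A"],
    "COUNT": [10.0, 20.0],
})
chunk_b = pd.DataFrame({
    "ACTIVITY_MONTH": pd.to_datetime(["2015-01-01", "2015-02-01"]),
    "PRODUCT_KEY": ["B", "B"],
    "COUNT": [5.0, 7.0],
})

# Concatenating without ignore_index=True keeps both 0..1 indexes,
# so the labels 0 and 1 each appear twice in the result.
df = pd.concat([chunk_a, chunk_b])

# Quick yes/no check for uniqueness.
print(df.index.is_unique)

# The labels that occur more than once.
duplicate_labels = df.index[df.index.duplicated()].unique()
print(list(duplicate_labels))
```

`Index.is_unique` is a cheap first check, and `Index.duplicated()` returns a boolean mask that also works for selecting the affected rows, e.g. `df[df.index.duplicated(keep=False)]`.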
And then to get rid of the duplicate values (if you don't need to keep the original index), you can e.g. do df = df.reset_index(drop=True), or you can pass ignore_index=True to concat when building the dataframe (or to append on pandas versions before 2.0, where DataFrame.append was removed).
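Putting the fix together with the original shift, a minimal sketch might look like the following. The data is hypothetical and the column names are taken from the question. It assumes each product has exactly one row per month with no gaps, so that shifting by one row within a group corresponds to shifting by one month.

```python
import pandas as pd

# Hypothetical data with a duplicated index (labels 0 and 1 appear twice).
df = pd.DataFrame(
    {
        "ACTIVITY_MONTH": pd.to_datetime(
            ["2015-02-01", "2015-01-01", "2015-01-01", "2015-02-01"]
        ),
        "PRODUCT_KEY": ["A", "A", "B", "B"],
        "COUNT": [20.0, 10.0, 5.0, 7.0],
    },
    index=[0, 1, 0, 1],
)

# Sort so that "previous row" means "previous month" within each product,
# then replace the duplicated index with a fresh 0..n-1 range.
df = df.sort_values(["PRODUCT_KEY", "ACTIVITY_MONTH"]).reset_index(drop=True)

# Each row receives the COUNT of the preceding row for the same PRODUCT_KEY;
# the first row of every group has no predecessor and becomes NaN.
df["NEW_COUNT"] = df.groupby("PRODUCT_KEY")["COUNT"].shift(1)
print(df)
```

If months can be missing for some products, a row-based shift would pair values from non-adjacent months, so a date-based approach (for example, adding one month to ACTIVITY_MONTH and merging the frame with itself) may be a better fit.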