How do I create a new dataframe column with values shifted from another column?

Question

I am returning data from a database query and want to create a new column in the resulting dataframe. The new column should hold the values of an existing column shifted forward by one month.

I have a dataframe that is populated from a SQL query and has the format:

df.dtypes
ACTIVITY_MONTH     datetime64[ns]
PRODUCT_KEY                object
COUNT                 float64

When I run:

df['NEW_COUNT'] = df.groupby('PRODUCT_KEY')['COUNT'].shift(+1)

I get this error:

ValueError: cannot reindex from a duplicate axis

This error doesn't make sense to me and I am not sure what to do to fix it. Any help is appreciated.

Answer 1

The error ValueError: cannot reindex from a duplicate axis indicates in this case that you have duplicate entries in your index. For this reason pandas cannot assign the result to a new column: it cannot know where to place the values for the duplicated index labels.

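To check for duplicate values in the index, you can do the following. On older pandas versions the call was df.index.get_duplicates(), which has since been removed:

df.index[df.index.duplicated()].unique()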
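As a rough illustration of how a duplicated index can arise and how to spot it, here is a minimal sketch. The two chunks and their values are made up for the example (the question does not show how the dataframe was assembled); only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data: two query chunks, each with its own default 0..n-1 index.
chunk_a = pd.DataFrame({
    "ACTIVITY_MONTH": pd.to_datetime(["2014-01-01", "2014-02-01"]),
    "PRODUCT_KEY": ["A", "A"],
    "COUNT": [10.0, 20.0],
})
chunk_b = pd.DataFrame({
    "ACTIVITY_MONTH": pd.to_datetime(["2014-01-01", "2014-02-01"]),
    "PRODUCT_KEY": ["B", "B"],
    "COUNT": [5.0, 7.0],
})

# Concatenating without ignore_index=True keeps both sets of labels: 0, 1, 0, 1.
df = pd.concat([chunk_a, chunk_b])

# Quick yes/no check on the index.
has_unique_index = df.index.is_unique

# The labels that occur more than once.
duplicate_labels = df.index[df.index.duplicated()].unique().tolist()

# Every row whose label is shared with at least one other row.
rows_with_shared_labels = df[df.index.duplicated(keep=False)]

print(has_unique_index)
print(duplicate_labels)
print(rows_with_shared_labels)
```

Passing keep=False to duplicated() marks every occurrence of a repeated label, which is usually more useful for inspection than the default of marking only the later ones.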

Then, to get rid of the duplicate values (if you don't need to keep the original index), you can e.g. do df = df.reset_index(drop=True) (reset_index returns a new dataframe, so assign the result back). Alternatively, pass ignore_index=True to concat when building the dataframe (or to append, on pandas versions that still have DataFrame.append).
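Putting the fix together with the shift from the question, a minimal sketch might look like this. The sample values and the repeated index are invented for illustration, and the sort step is an added assumption: shift(1) moves values by one row, so it only means "previous month" if each product's rows are in month order with no missing months.

```python
import pandas as pd

# Hypothetical data with a repeated index (labels 0, 1, 0, 1), unsorted within product A.
df = pd.DataFrame(
    {
        "ACTIVITY_MONTH": pd.to_datetime(
            ["2014-02-01", "2014-01-01", "2014-01-01", "2014-02-01"]
        ),
        "PRODUCT_KEY": ["A", "A", "B", "B"],
        "COUNT": [20.0, 10.0, 5.0, 7.0],
    },
    index=[0, 1, 0, 1],
)

# Sort so that "one row back" means "one month back", then rebuild a unique 0..n-1 index.
df = df.sort_values(["PRODUCT_KEY", "ACTIVITY_MONTH"]).reset_index(drop=True)

# Each row receives the COUNT of the preceding row within the same PRODUCT_KEY;
# the first month of each product has no predecessor and becomes NaN.
df["NEW_COUNT"] = df.groupby("PRODUCT_KEY")["COUNT"].shift(1)

print(df)
```

If some products have gaps in their months, a row-based shift would pair non-adjacent months; in that case a join on a month-offset key may be a better fit than shift.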


Answer 1: accepted 2014-06-05 23:31:47
