Reformatting a pandas DataFrame

Question

I have a pandas dataframe that looks like this:

      ITEM  SKU  PRICE
0  FOO OLD  120     45
1  FOO OLD  121     48
2  BAR OLD  122     51
3  BAR OLD  123     54
4  FOO NEW  120     60
5  FOO NEW  121     65
6  BAR NEW  122     70
7  BAR      123     75
8  BAR      124     80

Clarification: I can guarantee that the values in ITEM are unambiguous, and I will make sure each one splits correctly into A and B before the transformation.

I want to transform it into this:

  ITEM  SKU  OLD  NEW
0  FOO  120   45   60
1  FOO  121   48   65
2  BAR  122   51   70
3  BAR  123   54   75
4  BAR  124  NaN   80

I know I can split the old prices and new prices, rename columns, and even strip " NEW" and " OLD" from ITEM, but I have no clue what to do after that.

Further, I suspect these steps are unnecessary, because there is probably a cleaner way to reshape this dataframe.

Answer 1

Use:

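# Split ITEM on whitespace into a name (A) and a label (B); rows with no label get NaN in B
df[['A','B']] = df.pop('ITEM').str.split(expand=True)
# Treat unlabeled rows as NEW
df['B'] = df['B'].fillna('NEW')

# Move the OLD/NEW labels into columns, then flatten the index again
df = df.set_index(['A','SKU','B'])['PRICE'].unstack().reset_index().rename_axis(None, axis=1)
print(df)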
     A  SKU   NEW   OLD
0  BAR  122  70.0  51.0
1  BAR  123  75.0  54.0
2  BAR  124  80.0   NaN
3  FOO  120  60.0  45.0
4  FOO  121  65.0  48.0
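
For anyone who wants to run this end to end, here is a self-contained sketch. The DataFrame is retyped from the question. The final rename, sort and column reordering are not part of the answer above; they are an assumed extra step to match the layout the question asks for (ITEM, SKU, OLD, NEW, ordered by SKU), and the name `wide` is just illustrative.

```python
import pandas as pd

# Sample data retyped from the question
df = pd.DataFrame({
    "ITEM": ["FOO OLD", "FOO OLD", "BAR OLD", "BAR OLD",
             "FOO NEW", "FOO NEW", "BAR NEW", "BAR", "BAR"],
    "SKU": [120, 121, 122, 123, 120, 121, 122, 123, 124],
    "PRICE": [45, 48, 51, 54, 60, 65, 70, 75, 80],
})

# Split ITEM into name (A) and label (B); unlabeled rows count as NEW
df[["A", "B"]] = df.pop("ITEM").str.split(expand=True)
df["B"] = df["B"].fillna("NEW")

wide = (
    df.set_index(["A", "SKU", "B"])["PRICE"]
      .unstack()                          # labels in B become columns
      .reset_index()
      .rename_axis(None, axis=1)          # drop the leftover "B" columns name
      .rename(columns={"A": "ITEM"})      # assumed: restore the original column name
      .sort_values("SKU", ignore_index=True)
)
# Assumed: column order requested in the question
wide = wide[["ITEM", "SKU", "OLD", "NEW"]]
print(wide)
```

The price columns come out as floats because SKU 124 has no OLD price, so that cell has to hold NaN.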

If that fails because the index contains duplicate entries, use pivot_table, which aggregates the duplicates (taking the mean by default):

df = df.pivot_table(index=['A','SKU'], columns='B', values='PRICE').reset_index()
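
To illustrate the difference, here is a small sketch with made-up data in which the combination (FOO, 120, OLD) appears twice. The frame `dup` and its prices are hypothetical and not from the question. It shows `unstack` refusing to reshape, `pivot_table` averaging the duplicates, and `aggfunc="last"` as one possible alternative if the most recent row should win.

```python
import pandas as pd

# Hypothetical data: (FOO, 120, OLD) occurs twice
dup = pd.DataFrame({
    "A": ["FOO", "FOO", "FOO"],
    "SKU": [120, 120, 120],
    "B": ["OLD", "OLD", "NEW"],
    "PRICE": [45, 47, 60],
})

# unstack cannot decide which duplicate to keep and raises ValueError
try:
    dup.set_index(["A", "SKU", "B"])["PRICE"].unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True

# pivot_table aggregates duplicates; the default aggfunc is the mean
averaged = dup.pivot_table(index=["A", "SKU"], columns="B", values="PRICE").reset_index()

# Alternative: keep the last value seen for each (A, SKU, B) combination
latest = dup.pivot_table(index=["A", "SKU"], columns="B", values="PRICE",
                         aggfunc="last").reset_index()

print(unstack_failed)
print(averaged)
print(latest)
```

Whether averaging is appropriate depends on what the duplicates mean in your data, so it is worth choosing `aggfunc` deliberately rather than relying on the default.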
