Reformatting a pandas dataframe
I have a pandas dataframe that looks like this:
      ITEM  SKU  PRICE
0  FOO OLD  120     45
1  FOO OLD  121     48
2  BAR OLD  122     51
3  BAR OLD  123     54
4  FOO NEW  120     60
5  FOO NEW  121     65
6  BAR NEW  122     70
7      BAR  123     75
8      BAR  124     80
Clarification: I can ensure that there is no ambiguity about the value in ITEM, and in fact I will ensure that it is split into A and B correctly before the transformation.
I want to transform it into this:
  ITEM  SKU   OLD  NEW
0  FOO  120    45   60
1  FOO  121    48   65
2  BAR  122    51   70
3  BAR  123    54   75
4  BAR  124   NaN   80
I know I can split the old prices and new prices, rename columns, and even strip out the " NEW" and " OLD" from ITEM. I have no clue what to do with it after doing that.
Further, I suspect that these steps are unnecessary, because there is probably a cleaner way to reshape this dataframe.
Use str.split to separate ITEM into two columns, then unstack:
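# Split ITEM on whitespace into the item name (A) and the OLD/NEW label (B)
df[['A','B']] = df.pop('ITEM').str.split(expand=True)
# Rows with no label are treated as NEW
df['B'] = df['B'].fillna('NEW')
# Move the labels in B into columns, one row per (A, SKU) pair
df = df.set_index(['A','SKU','B'])['PRICE'].unstack().reset_index().rename_axis(None, axis=1)
print(df)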
A SKU NEW OLD
0 BAR 122 70.0 51.0
1 BAR 123 75.0 54.0
2 BAR 124 80.0 NaN
3 FOO 120 60.0 45.0
4 FOO 121 65.0 48.0
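For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question, and the helper name reshape_prices is hypothetical (it is not part of pandas); it only wraps the same split, fillna and unstack calls.

```python
import pandas as pd

# Sample data rebuilt from the question; the last two rows have no OLD/NEW label.
df = pd.DataFrame({
    "ITEM": ["FOO OLD", "FOO OLD", "BAR OLD", "BAR OLD",
             "FOO NEW", "FOO NEW", "BAR NEW", "BAR", "BAR"],
    "SKU": [120, 121, 122, 123, 120, 121, 122, 123, 124],
    "PRICE": [45, 48, 51, 54, 60, 65, 70, 75, 80],
})


def reshape_prices(frame):
    """Hypothetical helper: pivot OLD/NEW prices into separate columns."""
    out = frame.copy()  # leave the caller's frame untouched
    # Split ITEM on whitespace: column 0 is the name, column 1 the label (or missing)
    parts = out.pop("ITEM").str.split(expand=True)
    out["A"] = parts[0]
    # Assumption taken from the answer: an unlabeled row counts as NEW
    out["B"] = parts[1].fillna("NEW")
    # Each (A, SKU, B) combination must be unique for unstack() to work
    wide = out.set_index(["A", "SKU", "B"])["PRICE"].unstack()
    return wide.reset_index().rename_axis(None, axis=1)


result = reshape_prices(df)
print(result)
```

The price columns come back as floats because the missing OLD price for SKU 124 is stored as NaN.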
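If that fails because the (A, SKU, B) combinations contain duplicates, use pivot_table instead, which aggregates duplicate entries (by mean, by default):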
df = df.pivot_table(index=['A','SKU'], columns='B', values='PRICE').reset_index()
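To see what the pivot_table line does with duplicates, here is a small sketch on made-up data (the three rows below are hypothetical, not from the question). The default aggregation is the mean; passing aggfunc="last" is one possible alternative if the most recent row should win, which is an assumption about intent rather than something the question states.

```python
import pandas as pd

# Hypothetical data: (FOO, 120, OLD) appears twice, so unstack() would raise an error.
df = pd.DataFrame({
    "A": ["FOO", "FOO", "FOO"],
    "SKU": [120, 120, 120],
    "B": ["OLD", "OLD", "NEW"],
    "PRICE": [45, 47, 60],
})

# Default behaviour: duplicate entries are averaged
wide_mean = (
    df.pivot_table(index=["A", "SKU"], columns="B", values="PRICE")
      .reset_index()
      .rename_axis(None, axis=1)
)

# Alternative: keep the last value seen for each (A, SKU, B) combination
wide_last = (
    df.pivot_table(index=["A", "SKU"], columns="B", values="PRICE", aggfunc="last")
      .reset_index()
      .rename_axis(None, axis=1)
)

print(wide_mean)
print(wide_last)
```

Which aggregation is appropriate depends on why the duplicates exist, so it is worth inspecting them before choosing.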