
How do I generate different columns based on multiple values from different columns using pandas

I have the following dataset:

import numpy as np
import pandas as pd

data = {
'Partner': ['More', 'More', 'More', 'Reliance','Reliance','Reliance','Reliance','Reliance', 'More', 'More','Azfresh','Azfresh','Azfresh','Azfresh','Azfresh'],
'Brand': ['Biseliri','Biseliri','Biseliri','Biseliri','Biseliri','Biseliri','Kinili','Kinili','Kinili','Kinili','Biseliri','Biseliri','Biseliri','Kinili','Kinili'],
'Category': ['Milk','Milk','Milk','Milk','Milk','Milk','Water','Water','Water','Water','Water','Water','Water','Milk','Milk'],
'Product':['Milk_a','Milk_a','Milk_a','Milk_a','Milk_b','Milk_b','Water_a','Water_a','Water_b','Water_b','Water_a','Water_b','Water_a','Milk_b','Milk_b'],
'Yearweek':[202001,202003,202004,202001,202001,202002,202001,202002,202001,202002,202001,202001,202003,202001,202002],
'MRP':[50,45,50,50,45,45,100,90,150,150,110,150,100,50,50]}

# Build the DataFrame that the code below refers to as df
df = pd.DataFrame(data)

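I am trying to group the data by Partner, Brand, Category and Product, get the decrease or increase in each product's MRP, and see for how long the price stayed reduced.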


     Brand  Category    MRP Partner     Product Yearweek
0   Biseliri    Milk    50  More        Milk_a  202001
1   Biseliri    Milk    45  More        Milk_a  202003
2   Biseliri    Milk    50  More        Milk_a  202004
3   Biseliri    Milk    50  Reliance    Milk_a  202001
4   Biseliri    Milk    45  Reliance    Milk_b  202001
5   Biseliri    Milk    45  Reliance    Milk_b  202002
6   Kinili      Water   100 Reliance    Water_a 202001
7   Kinili      Water   90  Reliance    Water_a 202002
8   Kinili      Water   150 More        Water_b 202001
9   Kinili      Water   150 More        Water_b 202002
10  Biseliri    Water   110 Azfresh     Water_a 202001
11  Biseliri    Water   150 Azfresh     Water_b 202001
12  Biseliri    Water   100 Azfresh     Water_a 202003
13  Kinili      Milk    50  Azfresh     Milk_b  202001
14  Kinili      Milk    50  Azfresh     Milk_b  202002

So I tried grouping with the code below:

groupeddata = df.groupby(['Brand','Category','Partner','Product','Yearweek']).agg({'MRP':'min'}).reset_index()

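I aggregate with the minimum MRP in case the same group has more than one MRP. After this, I use the code below to compute the price difference within each group of products, to see whether the price went up or down. But I am not sure how to do it based on Yearweek.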

groupeddata['diff'] = groupeddata['MRP'].shift(+1)-groupeddata['MRP']
groupeddata['diff'].fillna('0',inplace = True)
groupeddata['diff'] = groupeddata['diff'].apply(lambda x:int(x))
groupeddata['mrpoff'] = groupeddata['diff'].astype(str)+np.where(groupeddata.eval("diff>0"),"rs less"," rs increased")

But this produces the wrong DataFrame.
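One likely cause, offered as a sketch rather than a confirmed diagnosis: `shift(+1)` runs over the whole frame, so the first row of each group is compared with the last row of the previous group. Shifting within each group avoids that. The small frame below is a hypothetical subset of the data above, and the column names `diff_global` and `diff_grouped` are made up for illustration.

```python
import pandas as pd

# Hypothetical subset: two products taken from the dataset above
df = pd.DataFrame({
    'Partner': ['More', 'More', 'Reliance', 'Reliance'],
    'Product': ['Milk_a', 'Milk_a', 'Water_a', 'Water_a'],
    'Yearweek': [202001, 202003, 202001, 202002],
    'MRP': [50, 45, 100, 90],
})

keys = ['Partner', 'Product']
df = df.sort_values(keys + ['Yearweek']).reset_index(drop=True)

# Ungrouped shift: row 2 is compared with row 1, which belongs to another group
df['diff_global'] = (df['MRP'].shift(1) - df['MRP']).fillna(0).astype(int)

# Grouped shift: the first row of every group has no predecessor, so its diff is 0
df['diff_grouped'] = (df.groupby(keys)['MRP'].shift(1) - df['MRP']).fillna(0).astype(int)

print(df)
```

In this subset the ungrouped version reports a spurious -55 on the first Water_a row, while the grouped version reports 0 there.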

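This is what I am trying to achieve: if the price difference holds for 2 weeks, then noofdays should be 14. In rows 1 and 2, the MRP goes back up after staying at 45 for only 1 week, so noofdays is 7. If the MRP had stayed at 45 for both 202003 and 202004 and increased later, then noofdays should be 2 weeks * 7 days = 14 days.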

    Brand   Category    MRP Partner     Product Yearweek    diff    noofdays
0   Biseliri    Milk    50  More        Milk_a  202001         0    0
1   Biseliri    Milk    45  More        Milk_a  202003         5    7
2   Biseliri    Milk    50  More        Milk_a  202004        -5    0
3   Biseliri    Milk    50  Reliance    Milk_a  202001         0    0
4   Biseliri    Milk    45  Reliance    Milk_b  202001         0    0
5   Biseliri    Milk    45  Reliance    Milk_b  202002         0    0
6   Kinili      Water   100 Reliance    Water_a 202001         0    0
7   Kinili      Water   90  Reliance    Water_a 202002        10    7
8   Kinili      Water   150 More        Water_b 202001         0    0
9   Kinili      Water   150 More        Water_b 202002         0    0
10  Biseliri    Water   110 Azfresh     Water_a 202001         0    0
11  Biseliri    Water   150 Azfresh     Water_b 202001         0    0
12  Biseliri    Water   100 Azfresh     Water_a 202003        10    7
13  Kinili      Milk    50  Azfresh     Milk_b  202001         0    0
14  Kinili      Milk    50  Azfresh     Milk_b  202002         0    0

Please help, thanks!

I don't quite understand what you're after, but maybe this is a start?

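# Sort first so the per-group difference follows Yearweek order
(df
 .sort_values(['Brand','Category','Partner','Product', 'Yearweek'])
 # current MRP minus previous MRP within each group (negative means the price dropped)
 .assign(diff=lambda x: x.groupby(['Brand','Category','Partner','Product'])["MRP"].diff())
 # the first row of each group has no previous price
 .fillna({'diff': 0})
)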
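Building on the per-group difference idea, here is a minimal sketch of one way the `noofdays` column might be derived. It rests on several assumptions that the question does not confirm: `diff` is the previous price minus the current price (so a drop is positive, as in the expected output), each observed row counts as one week regardless of gaps in Yearweek, there is one row per group and Yearweek, and `noofdays` is reported only on the row where a drop begins. The helper column `run` is made up for illustration.

```python
import pandas as pd

data = {
    'Partner': ['More'] * 3 + ['Reliance'] * 5 + ['More'] * 2 + ['Azfresh'] * 5,
    'Brand': ['Biseliri'] * 6 + ['Kinili'] * 4 + ['Biseliri'] * 3 + ['Kinili'] * 2,
    'Category': ['Milk'] * 6 + ['Water'] * 7 + ['Milk'] * 2,
    'Product': ['Milk_a', 'Milk_a', 'Milk_a', 'Milk_a', 'Milk_b', 'Milk_b',
                'Water_a', 'Water_a', 'Water_b', 'Water_b',
                'Water_a', 'Water_b', 'Water_a', 'Milk_b', 'Milk_b'],
    'Yearweek': [202001, 202003, 202004, 202001, 202001, 202002, 202001, 202002,
                 202001, 202002, 202001, 202001, 202003, 202001, 202002],
    'MRP': [50, 45, 50, 50, 45, 45, 100, 90, 150, 150, 110, 150, 100, 50, 50],
}
df = pd.DataFrame(data)

keys = ['Brand', 'Category', 'Partner', 'Product']
out = df.sort_values(keys + ['Yearweek']).copy()

# Previous price within the same group (NaN on the first row of each group)
prev = out.groupby(keys)['MRP'].shift(1)

# Previous minus current: positive means the price dropped
out['diff'] = (prev - out['MRP']).fillna(0).astype(int)

# A new run starts whenever the price changes or a new group begins
out['run'] = (out['diff'].ne(0) | prev.isna()).cumsum()

# Run length in observed rows (assumption: one row = one week)
run_len = out.groupby('run')['MRP'].transform('size')

# Report the duration only on the row where a price drop begins
out['noofdays'] = (run_len * 7).where(out['diff'] > 0, 0).astype(int)

# Restore the original row order and drop the helper column
out = out.drop(columns='run').sort_index()
print(out)
```

On the sample data this reproduces the `diff` and `noofdays` columns of the expected output in the question. If the gaps between Yearweek values should count as well, the run length would need to be computed from calendar weeks instead of row counts.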

