pandas groupby with if condition

I have the DataFrame below with invoice data. I want to add a MainCode column based on the following logic.

1 - First, group by ticket_id and id. Within each group, if level is zero, MainCode should be zero. Otherwise, MainCode should take the code of the group's level-zero row.

+-----------+----+-------+------+
| ticket_id | id | level | code |
+-----------+----+-------+------+
|         1 |  0 |     0 | 1710 |
|         1 |  0 |     1 |  372 |
|         1 |  0 |     2 |  607 |
|         1 |  1 |     0 | 1727 |
|         1 |  1 |     1 |  370 |
|         1 |  1 |     2 |  607 |
|         2 |  0 |     0 |  269 |
|         2 |  0 |     1 |  371 |
|         2 |  0 |     2 |  607 |
|         2 |  1 |     0 |  277 |
|         2 |  1 |     1 |  371 |
|         2 |  1 |     2 |  607 |
+-----------+----+-------+------+

So far, I have written the code below:

df.groupby(['ticket_id','id'])['code'].transform(lambda x: if df['level'] == 0, 0, df['code'])

but I'm not able to get the correct output.

My desired output is as below:

+-----------+----+-------+------+----------+
| ticket_id | id | level | code | MainCode |
+-----------+----+-------+------+----------+
|         1 |  0 |     0 | 1710 |        0 |
|         1 |  0 |     1 |  372 |     1710 |
|         1 |  0 |     2 |  607 |     1710 |
|         1 |  1 |     0 | 1727 |        0 |
|         1 |  1 |     1 |  370 |     1727 |
|         1 |  1 |     2 |  607 |     1727 |
|         2 |  0 |     0 |  269 |        0 |
|         2 |  0 |     1 |  371 |      269 |
|         2 |  0 |     2 |  607 |      269 |
|         2 |  1 |     0 |  277 |        0 |
|         2 |  1 |     1 |  371 |      277 |
|         2 |  1 |     2 |  607 |      277 |
+-----------+----+-------+------+----------+

Please guide me to solve this.

You could check which values in level are different from 0, and multiply the boolean result by the first value of the corresponding group, which can be obtained with groupby.transform, aggregating with first (this relies on the level-zero row coming first in each group, as it does in the sample data):

df['MainCode'] = (df.level.ne(0)
                    .mul(df.groupby(['ticket_id','id']).code
                    .transform('first')))

    ticket_id  id  level  code  MainCode
0           1   0      0  1710         0
1           1   0      1   372      1710
2           1   0      2   607      1710
3           1   1      0  1727         0
4           1   1      1   370      1727
5           1   1      2   607      1727
6           2   0      0   269         0
7           2   0      1   371       269
8           2   0      2   607       269
9           2   1      0   277         0
10          2   1      1   371       277
11          2   1      2   607       277
