Pivot specific rows to columns based on certain conditions in pandas
This is the dataframe I am working with. There can be multiple customers associated with a given case ID in different months (case_ID, cust_val, and date together form the primary key of this table).
case_ID| cust_val | date | primary | action | change |
1 | xx | 3/2 | 1 | | increase |
1 | xx | 3/2 | | 1 | decrease |
1 | xx | 3/1 | 1 | | decrease |
1 | xx | 3/1 | | 1 | decrease |
1 | yy | 3/2 | 1 | | decrease |
1 | yy | 3/2 | | 1 | increase |
2 | yy | 3/2 | | 1 | increase |
2 | yy | 3/2 | 1 | | increase |
I want the output table to look like this, where for each (case_ID, cust_val, date) combination the change associated with primary and the change associated with action are in one row:
case_ID| cust_val | date | primary_change | action_change |
1 | xx | 3/2 | increase | decrease |
1 | xx | 3/1 | decrease | decrease |
1 | yy | 3/2 | decrease | increase |
2 | yy | 3/2 | increase | increase |
I tried this, but it is obviously wrong and I am not sure how to solve this:
df.pivot(index=['case_ID','cust_val','date'], columns=['primary', 'action'], values='change').reset_index()
Any help is appreciated. Thanks in advance.
You can filter the dataframe and merge:
import pandas as pd

# df is the dataframe from the question
a = df[df.primary == "1"]  # <-- change "1" to 1 if the values are integers
b = df[df.action == "1"]

# join the primary rows to the action rows on the key columns
x = (
    pd.merge(a, b, on=["case_ID", "cust_val", "date"])
    .rename(columns={"change_x": "primary_change", "change_y": "action_change"})
    .drop(columns=["primary_x", "action_x", "primary_y", "action_y"])
)
print(x)
Prints:
   case_ID cust_val date primary_change action_change
0        1       xx  3/2       increase      decrease
1        1       xx  3/1       decrease      decrease
2        1       yy  3/2       decrease      increase
3        2       yy  3/2       increase      increase
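If you would rather stay close to the pivot from the question, one possible alternative is to first collapse the two flag columns into a single label column and pivot on that. This is only a sketch under some assumptions: the sample frame below is rebuilt from the question's table, the flags are assumed to be the integer 1 with NaN for blanks, and the column name "kind" is a made-up helper, not something from the original data.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumption: flags are 1 / NaN).
df = pd.DataFrame(
    {
        "case_ID": [1, 1, 1, 1, 1, 1, 2, 2],
        "cust_val": ["xx", "xx", "xx", "xx", "yy", "yy", "yy", "yy"],
        "date": ["3/2", "3/2", "3/1", "3/1", "3/2", "3/2", "3/2", "3/2"],
        "primary": [1, np.nan, 1, np.nan, 1, np.nan, np.nan, 1],
        "action": [np.nan, 1, np.nan, 1, np.nan, 1, 1, np.nan],
        "change": ["increase", "decrease", "decrease", "decrease",
                   "decrease", "increase", "increase", "increase"],
    }
)

# Hypothetical helper column: label each row by which flag is set.
df["kind"] = np.where(df["primary"].eq(1), "primary_change", "action_change")

# One row per key, one column per label.
wide = df.pivot(
    index=["case_ID", "cust_val", "date"], columns="kind", values="change"
).reset_index()
wide.columns.name = None  # drop the leftover "kind" axis label
wide = wide[["case_ID", "cust_val", "date", "primary_change", "action_change"]]
print(wide)
```

Passing a list as index to pivot needs a reasonably recent pandas, and pivot raises an error if a key has two rows with the same label, so this relies on (case_ID, cust_val, date) really being unique per flag. The rows may also come back sorted by the key columns rather than in the original order.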