Failed to count occurrences of values in a DataFrame based on a few columns with groupby
I have a pandas DataFrame:
id colA colB colC
194 1 0 1
194 1 1 0
194 2 1 3
195 1 1 2
195 0 1 0
197 1 1 2
I would like to count the occurrences of each value, grouped by id. In my case, the expected result is:
id countOfValue0 countOfValue1 countOfValue2 countOfValue3
194 2 3 1 1
195 1 2 1 0
197 0 1 1 0
A value that appears more than once in the same row is counted only once for that row (this is why value 1 has a count of 3 for id=194). I thought of splitting the data into 3 data frames by grouping on id-colA, id-colB, and id-colC, with something like df.groupby('id', 'colaA'), but I can't find a proper way to combine the counts from those data frames by id. There is probably a more efficient way to do this.
Try:
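import pandas as pd

# Assumes df is the frame from the question, with a default integer index.
res = (
    df.set_index("id", append=True).stack()                 # one entry per cell
    .reset_index(level=0).reset_index(level=1, drop=True)   # keep the row label, drop the column name
    .drop_duplicates().assign(_dummy=1)                     # count each value once per row
    .rename(columns={0: "countOfValue"})
    .pivot_table(index="id", columns="countOfValue", values="_dummy", aggfunc="sum")
    .fillna(0).astype(int)
)
res = res.add_prefix("countOfValue")
res.columns.name = None  # `del res.columns.name` can raise AttributeError on newer pandas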
Outputs:
countOfValue0 ... countOfValue3
id ...
194 2 ... 1
195 1 ... 0
197 0 ... 0
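A possibly more readable variant of the same idea, as a sketch rather than a definitive solution: reshape to long format with melt, drop repeated values within each original row, then cross-tabulate id against value. The frame below is rebuilt from the question's sample data, and the names cells and value are only illustrative. It assumes df has a default integer index, so that reset_index() produces a column named "index".

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id":   [194, 194, 194, 195, 195, 197],
    "colA": [1, 1, 2, 1, 0, 1],
    "colB": [0, 1, 1, 1, 1, 1],
    "colC": [1, 0, 3, 2, 0, 2],
})

# Long format: one row per cell, keeping the original row label in "index"
cells = df.reset_index().melt(id_vars=["index", "id"], value_name="value")

# A value repeated within the same original row is counted only once
cells = cells.drop_duplicates(subset=["index", "value"])

# Count how many rows of each id contain each value; absent pairs become 0
res = pd.crosstab(cells["id"], cells["value"]).add_prefix("countOfValue")
res.columns.name = None

print(res)
```

crosstab fills missing id/value combinations with 0 on its own, so no fillna or astype step is needed here.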