How to make a column of median for each value within groups?

I have three columns: BatchID, UnitID, and Score.

At the moment, the data set looks something like this:

BatchID      UnitID           Score
A123         A123-100         0.111
A123         A123-101         0.121
A123         A123-102         0.101
A123         A123-103         0.102
B456         B456-200         0.211
B456         B456-201         0.221
C789         C789-001         0.199
C789         C789-002         0.189
C789         C789-003         0.192
C789         C789-004         0.201
...          ...              ...

I want to add a "Median" column that holds the median score of each batch and sits next to the rest of the data (repeating the same median value for every unit in a given batch). Something like this:

BatchID      UnitID           Score      Median
A123         A123-100         0.111      0.1065
A123         A123-101         0.121      0.1065
A123         A123-102         0.101      0.1065
A123         A123-103         0.102      0.1065
B456         B456-200         0.211      0.2160
B456         B456-201         0.221      0.2160
C789         C789-001         0.199      0.1955
C789         C789-002         0.189      0.1955
C789         C789-003         0.192      0.1955
C789         C789-004         0.201      0.1955
...          ...              ...        ...

I tried groupby, among other things, but since I don't really know how to use it in this case, it isn't giving me the desired output.

Thank you!

Use groupby with transform:

df['Median'] = df.groupby('BatchID')['Score'].transform('median')

Output:

  BatchID    UnitID  Score  Median
0    A123  A123-100  0.111  0.1065
1    A123  A123-101  0.121  0.1065
2    A123  A123-102  0.101  0.1065
3    A123  A123-103  0.102  0.1065
4    B456  B456-200  0.211  0.2160
5    B456  B456-201  0.221  0.2160
6    C789  C789-001  0.199  0.1955
7    C789  C789-002  0.189  0.1955
8    C789  C789-003  0.192  0.1955
9    C789  C789-004  0.201  0.1955
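
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample rows from the question by hand (the DataFrame construction and the per_batch name are my own, not from the original post) and contrasts a plain aggregation, which collapses each batch to one row, with transform, which returns a value for every original row:

```python
import pandas as pd

# Sample rows copied from the question; building the frame inline is an
# assumption, since the original post does not show how df was loaded.
df = pd.DataFrame({
    "BatchID": ["A123"] * 4 + ["B456"] * 2 + ["C789"] * 4,
    "UnitID": [
        "A123-100", "A123-101", "A123-102", "A123-103",
        "B456-200", "B456-201",
        "C789-001", "C789-002", "C789-003", "C789-004",
    ],
    "Score": [0.111, 0.121, 0.101, 0.102,
              0.211, 0.221,
              0.199, 0.189, 0.192, 0.201],
})

# Plain aggregation: one median per batch, indexed by BatchID.
# This is a Series with 3 entries, so it cannot be assigned row by row.
per_batch = df.groupby("BatchID")["Score"].median()

# transform: same medians, but broadcast back onto the original index,
# so the result has one value per row of df.
df["Median"] = df.groupby("BatchID")["Score"].transform("median")

print(per_batch)
print(df)
```

The aggregated result has one entry per batch, which is probably why a bare groupby did not line up with the original rows. transform keeps the original index, so the assignment works directly.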
