How do I create a new column of max values of a column (corresponding to a specific name) using pandas?
I'm wondering if it is possible to use pandas to create a new column holding the max value of a column for each name, so that every name has its own max value.
For example:
name value max
Alice 1 9
Linda 1 1
Ben 3 5
Alice 4 9
Alice 9 9
Ben 5 5
Linda 1 1
So for Alice, we are picking the max of 1, 4, and 9, which is 9. For Linda, max(1, 1) = 1, and for Ben, max(3, 5) = 5.
I was thinking of using .loc to select the rows where name == "Alice", then getting the max value of those rows, then creating the new column. But since I'm dealing with a large dataset, this does not seem like a good option. Is there a smarter way to do this, so that I don't need to know the specific names?
Using groupby and taking the max gives the max per name, which is then merged with the original df:
df.merge(df.groupby(['name'])['value'].max().reset_index(),
on='name').rename(
columns={'value_x' : 'value',
'value_y' : 'max'})
name value max
0 Alice 1 9
1 Alice 4 9
2 Alice 9 9
3 Linda 1 1
4 Linda 1 1
5 Ben 3 5
6 Ben 5 5
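A variation on the merge above, as a minimal sketch: naming the aggregated Series `max` before merging avoids the `_x`/`_y` suffixes, so no rename step is needed. The sample DataFrame is rebuilt from the table in the question, and `per_name_max` and `merged` are just illustrative names. The row order after the merge may differ between pandas versions.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "name": ["Alice", "Linda", "Ben", "Alice", "Alice", "Ben", "Linda"],
    "value": [1, 1, 3, 4, 9, 5, 1],
})

# Per-name maxima as a two-column frame: name, max
per_name_max = df.groupby("name")["value"].max().rename("max").reset_index()

# Merge the maxima back onto every row of the original frame
merged = df.merge(per_name_max, on="name")
print(merged)
```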
You could use transform or map:
df['max'] = df.groupby('name')['value'].transform('max')
or
df['max'] = df['name'].map(df.groupby('name')['value'].max())
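To try both one-liners end to end, here is a small self-contained sketch using the sample data from the question. The names `via_transform` and `via_map` are only for illustration. Both approaches keep the original row order and index, so the result can be assigned straight back to the DataFrame.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({
    "name": ["Alice", "Linda", "Ben", "Alice", "Alice", "Ben", "Linda"],
    "value": [1, 1, 3, 4, 9, 5, 1],
})

# transform broadcasts each group's max back onto that group's rows
via_transform = df.groupby("name")["value"].transform("max")

# map looks up each row's name in the Series of per-name maxima,
# which is indexed by name
via_map = df["name"].map(df.groupby("name")["value"].max())

df["max"] = via_transform
print(df)
```

Since transform does the grouping and the broadcast in a single step, it is usually the shorter choice; map is handy when the per-name maxima are already available as a Series.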