
Getting max value and index with max value at the same time from a pandas dataframe

Suppose I have the following dataframe:

  Country  Year  Count
0     USA  2021   1500
1     USA  2018   6000
2   India  2019   3000
3   India  2021   5000
4      UK  2019   4000
5     USA  2019   3200
6   India  2018   5000
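
For anyone who wants to run the snippets on this page, the table above can be rebuilt roughly as follows. This construction is not part of the original post; it is a minimal sketch that assumes a default integer index and the column order shown.

```python
import pandas as pd

# Rebuild the example table; values are copied from the question.
df = pd.DataFrame({
    "Country": ["USA", "USA", "India", "India", "UK", "USA", "India"],
    "Year": [2021, 2018, 2019, 2021, 2019, 2019, 2018],
    "Count": [1500, 6000, 3000, 5000, 4000, 3200, 5000],
})
print(df)
```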

I would like to print the following:

Entry with Max count is (USA, 2018, 6000)

Country with max total count is: (India, 13000)

Entry with max count in each year is:
2018, USA, 6000
2019, UK, 4000
2021, India, 5000

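The code below works, but I have a couple of questions to see whether I can do better:

  1. Is there a way to get the index of the max and the max value at the same time, instead of calling idxmax and then looking up the value at that index?
  2. Is there a cleaner, simpler way to get all three of the outputs I want?
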
import numpy as np
import pandas as pd

# Print (country, year, count) of the row with max count among all entries
max_idx = df['Count'].idxmax()
print("Entry with Max count is (" + \
      str(df.loc[max_idx]['Country']) + ", " \
      + str(df.loc[max_idx]['Year']) + ", " \
      + str(df.loc[max_idx]['Count']) + ")" )

# Print country with max total count and print (country, max total count)
country_sum = pd.pivot_table(df, index='Country', aggfunc=np.sum)
print("\nCountry with max total count is: ("\
      + country_sum['Count'].idxmax() + ", "\
      + str(country_sum['Count'].max())\
      + ")")


# Print country with max count in each year
year_country_groupby = df.groupby('Year')
print('\nEntry with max count in each year is:')
for key, gdf in year_country_groupby:
    max_idx = gdf['Count'].idxmax()
    print(str(key) + ", "\
          + str(gdf.loc[max_idx]['Country']) + ", "\
          + str(gdf.loc[max_idx]['Count']))

You can simplify your output code like this:

# 1st output
cty, year, cnt = df.loc[df['Count'].idxmax()]
print(f"Entry with Max count is ({cty}, {year}, {cnt})")

# 2nd output
cty, cnt = df.groupby('Country')['Count'].sum().nlargest(1).reset_index().squeeze()
print(f"Country with max total count is: ({cty}, {cnt})")

# 3rd output
print("Entry with max count in each year is:")
for _, (cty, year, cnt) in df.loc[df.groupby('Year')['Count'].idxmax()].iterrows():
    print(f"{year}, {cty}, {cnt}")

Output:

Entry with Max count is (USA, 2018, 6000)

Country with max total count is: (India, 13000)

Entry with max count in each year is:
2018, USA, 6000
2019, UK, 4000
2021, India, 5000
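
Note that the tuple unpacking above depends on the columns being in the order Country, Year, Count. If that is not guaranteed, one possible variant is to select the columns by name and iterate with itertuples. This is only a sketch of an alternative, not part of the original answer; the names best_per_year and lines are made up for illustration.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "Country": ["USA", "USA", "India", "India", "UK", "USA", "India"],
    "Year": [2021, 2018, 2019, 2021, 2019, 2019, 2018],
    "Count": [1500, 6000, 3000, 5000, 4000, 3200, 5000],
})

# Row label of the max Count within each year, then pick columns by name.
best_per_year = df.loc[df.groupby("Year")["Count"].idxmax(),
                       ["Year", "Country", "Count"]]

# itertuples yields named tuples, so fields are accessed by column name.
lines = [f"{row.Year}, {row.Country}, {row.Count}"
         for row in best_per_year.itertuples(index=False)]
print("\n".join(lines))
```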

Update: to get the index of the max and the max value at the same time, you can use agg:

idxmax, valmax = df['Count'].agg(['idxmax', 'max'])
print(idxmax, valmax)

# Output:
1 6000
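
The same agg idea should also carry over to the per-year case. The following is a hedged sketch, not from the original answer, and assumes a reasonably recent pandas in which grouped Series accept 'idxmax' as an aggregation name; per_year is a hypothetical variable name.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "Country": ["USA", "USA", "India", "India", "UK", "USA", "India"],
    "Year": [2021, 2018, 2019, 2021, 2019, 2019, 2018],
    "Count": [1500, 6000, 3000, 5000, 4000, 3200, 5000],
})

# One row per year with two columns: the row label of the max and the max itself.
per_year = df.groupby("Year")["Count"].agg(["idxmax", "max"])

# Look up the country at each winning row label (to_numpy avoids index alignment).
per_year["Country"] = df.loc[per_year["idxmax"], "Country"].to_numpy()
print(per_year)
```

Here the idxmax column holds labels from the original index of df, so it can be passed straight to df.loc to fetch any other column of the winning rows.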
