Getting max value and index with max value at the same time from a pandas dataframe

Suppose I have the following dataframe:

  Country  Year  Count
0     USA  2021   1500
1     USA  2018   6000
2   India  2019   3000
3   India  2021   5000
4      UK  2019   4000
5     USA  2019   3200
6   India  2018   5000
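
To reproduce the examples, the table above can be rebuilt roughly as follows. This is a minimal sketch: the column names and values are copied from the printed table, and the integer dtypes are an assumption.

```python
import pandas as pd

# Rebuild the sample data shown above.
# Column names match the printed table; int dtypes are assumed.
df = pd.DataFrame({
    "Country": ["USA", "USA", "India", "India", "UK", "USA", "India"],
    "Year": [2021, 2018, 2019, 2021, 2019, 2019, 2018],
    "Count": [1500, 6000, 3000, 5000, 4000, 3200, 5000],
})
print(df)
```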

I want to print the following:

Entry with Max count is (USA, 2018, 6000)

Country with max total count is: (India, 13000)

Entry with max count in each year is:
2018, USA, 6000
2019, UK, 4000
2021, India, 5000

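The code below works, but I have a couple of questions to see whether I can do better:

  1. Is there a way to get the index of the max and the max value at the same time, instead of getting the idxmax and then looking up the value there?
  2. Is there a cleaner, simpler way to get all three quantities I want?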
import pandas as pd

# df is the dataframe shown above

# Print (country, year, count) of the row with max count among all entries
max_idx = df['Count'].idxmax()
print("Entry with Max count is (" + \
      str(df.loc[max_idx]['Country']) + ", " \
      + str(df.loc[max_idx]['Year']) + ", " \
      + str(df.loc[max_idx]['Count']) + ")" )

# Print country with max total count and print (country, max total count)
# Restrict the pivot to 'Count' so that 'Year' is not summed as well
country_sum = pd.pivot_table(df, index='Country', values='Count', aggfunc='sum')
print("\nCountry with max total count is: ("\
      + country_sum['Count'].idxmax() + ", "\
      + str(country_sum['Count'].max())\
      + ")")


# Print country with max count in each year
year_country_groupby = df.groupby('Year')
print('\nEntry with max count in each year is:')
for key, gdf in year_country_groupby:
    max_idx = gdf['Count'].idxmax()
    print(str(key) + ", "\
          + str(gdf.loc[max_idx]['Country']) + ", "\
          + str(gdf.loc[max_idx]['Count']))

You can simplify your output like this:

# 1st output
cty, year, cnt = df.loc[df['Count'].idxmax()]
print(f"Entry with Max count is ({cty}, {year}, {cnt})")

# 2nd output
cty, cnt = df.groupby('Country')['Count'].sum().nlargest(1).reset_index().squeeze()
print(f"Country with max total count is: ({cty}, {cnt})")

# 3rd output
print("Entry with max count in each year is:")
for _, (cty, year, cnt) in df.loc[df.groupby('Year')['Count'].idxmax()].iterrows():
    print(f"{year}, {cty}, {cnt}")

Output:

Entry with Max count is (USA, 2018, 6000)

Country with max total count is: (India, 13000)

Entry with max count in each year is:
2018, USA, 6000
2019, UK, 4000
2021, India, 5000
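
As a possible variation on the third output, the rows can be read by column name with `itertuples` instead of unpacking each row positionally from `iterrows`. This is only a sketch under the assumption that the dataframe has the columns `Country`, `Year` and `Count` as in the question; the names `best_per_year` and `lines` are made up for illustration.

```python
import pandas as pd

# Sample data from the question (rebuilt here so the snippet runs on its own).
df = pd.DataFrame({
    "Country": ["USA", "USA", "India", "India", "UK", "USA", "India"],
    "Year": [2021, 2018, 2019, 2021, 2019, 2019, 2018],
    "Count": [1500, 6000, 3000, 5000, 4000, 3200, 5000],
})

# One row label per year: the row holding that year's largest Count.
best_per_year = df.loc[df.groupby("Year")["Count"].idxmax()]

# itertuples yields namedtuples, so fields are accessed by column name.
lines = [f"{row.Year}, {row.Country}, {row.Count}"
         for row in best_per_year.itertuples(index=False)]

print("Entry with max count in each year is:")
print("\n".join(lines))
```

Accessing fields by name keeps the loop correct if the column order changes. Note that `idxmax` returns the first occurrence when several rows tie for the maximum.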

Update: to get the index of the max and the max value at the same time, you can use agg:

idxmax, valmax = df['Count'].agg(['idxmax', 'max'])
print(idxmax, valmax)

# Output:
1 6000
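
The same `agg` call should also work on the per-country totals, which would give the second output without `nlargest` and `reset_index`. A minimal sketch, assuming the dataframe from the question; `totals`, `top_country` and `top_total` are illustrative names.

```python
import pandas as pd

# Sample data from the question (rebuilt here so the snippet runs on its own).
df = pd.DataFrame({
    "Country": ["USA", "USA", "India", "India", "UK", "USA", "India"],
    "Year": [2021, 2018, 2019, 2021, 2019, 2019, 2018],
    "Count": [1500, 6000, 3000, 5000, 4000, 3200, 5000],
})

# Total Count per country, indexed by country name.
totals = df.groupby("Country")["Count"].sum()

# agg with a list returns a Series holding both results; unpack its two values.
top_country, top_total = totals.agg(["idxmax", "max"])
print(f"Country with max total count is: ({top_country}, {top_total})")
```

Here `idxmax` returns the index label, which is the country name because the totals are indexed by `Country`.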

