How to sum the values in a column grouped by another column

Question

I have this CSV file:

ID,NAME,CITY,COUNTRY,CPERSON,EMPLCNT,CONTRCNT,CONTRCOST
00000001,Breadpot,Sydney,Australia,Sam.Keng@info.com,250,48,1024.00
00000002,Hoviz,Manchester,UK,harry.ham@hoviz.com,150,7,900.00
00000003,Hoviz,London,UK,hamlet.host@hoviz.com,1500,12800,10510.50
00000004,Grenns,London,UK,grenns@grenns.com,200,12800,128.30
00000005,Magnolia,Chicago,USA,man@info.com,1024,25600,512000.00
00000006,Dozen,San Francisco,USA,dozen@dozen.com,1000,5,1000.20
00000007,Sun,San Francisco,USA,sunny@sun.com,2000,2,10000.01

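What I want to do is find the COUNTRY with the largest CONTRCNT. Some countries appear more than once in the dataframe, so I need the country whose CONTRCNT values add up to the largest total.

I thought about summing CONTRCNT for every country by hand and then picking the largest, but I'd like to avoid a brute-force approach. What I really want to know is how to solve this with Pandas' groupby function.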

Answer 1

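You can group by COUNTRY, sum CONTRCNT, and then call idxmax, which returns the label of the largest total:

df.groupby('COUNTRY')['CONTRCNT'].sum().idxmax()

Or, to keep every country that ties for the maximum, together with its total:

s = df.groupby('COUNTRY')['CONTRCNT'].sum()
s[s == s.max()]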
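The difference between the two forms matters for the data in the question, because two countries end up with the same CONTRCNT total. Here is a minimal sketch, assuming the CSV is pasted in as a string (the `csv_text`, `totals`, `first_max` and `tied` names are only for illustration):

```python
import io
import pandas as pd

# The CSV from the question, embedded so the example runs without a file.
csv_text = """ID,NAME,CITY,COUNTRY,CPERSON,EMPLCNT,CONTRCNT,CONTRCOST
00000001,Breadpot,Sydney,Australia,Sam.Keng@info.com,250,48,1024.00
00000002,Hoviz,Manchester,UK,harry.ham@hoviz.com,150,7,900.00
00000003,Hoviz,London,UK,hamlet.host@hoviz.com,1500,12800,10510.50
00000004,Grenns,London,UK,grenns@grenns.com,200,12800,128.30
00000005,Magnolia,Chicago,USA,man@info.com,1024,25600,512000.00
00000006,Dozen,San Francisco,USA,dozen@dozen.com,1000,5,1000.20
00000007,Sun,San Francisco,USA,sunny@sun.com,2000,2,10000.01
"""

# Read ID as text so the leading zeros are preserved.
df = pd.read_csv(io.StringIO(csv_text), dtype={"ID": str})

# One total per country; the group keys become the index, sorted alphabetically.
totals = df.groupby("COUNTRY")["CONTRCNT"].sum()
print(totals)  # Australia 48, UK 25607, USA 25607

# idxmax returns only the first label that holds the maximum.
first_max = totals.idxmax()
print(first_max)  # UK

# A boolean mask keeps every label that holds the maximum.
tied = totals[totals == totals.max()]
print(tied)  # UK 25607, USA 25607
```

So idxmax is enough when a single answer will do, while the mask is the safer choice when ties need to be reported.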

Answer 2

You can try this:

import pandas as pd

df = pd.read_csv("data.csv")
print(df,"\n")

country = df.groupby('COUNTRY')['CONTRCNT'].sum()
country = country[country==country.max()]
print(country,"\n")

# After groupby, the grouping column becomes the index, so the country names can be read from it
print(country.index.values, "\n")

# If several countries tie for the maximum, the filtered Series keeps them in group-key (alphabetical) order; position -1 picks the last one
print("Country with the largest number of customers' contracts:", country.index.values[-1], "({} contracts)".format(country.iloc[-1]))

This should give you the output you want.
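If picking just the last of several tied countries is not what you want, one alternative is Series.nlargest with keep="all", which returns every row that shares the top value. This is a sketch rather than part of the answer above; it assumes the data is reduced to the two relevant columns, and the `csv_text`, `totals` and `winners` names are hypothetical:

```python
import io
import pandas as pd

# Only the COUNTRY and CONTRCNT columns from the question's CSV.
csv_text = """COUNTRY,CONTRCNT
Australia,48
UK,7
UK,12800
UK,12800
USA,25600
USA,5
USA,2
"""

df = pd.read_csv(io.StringIO(csv_text))
totals = df.groupby("COUNTRY")["CONTRCNT"].sum()

# n=1 asks for the single largest total; keep="all" also returns any ties with it.
winners = totals.nlargest(1, keep="all")

for name, count in winners.items():
    print("{} ({} contracts)".format(name, count))
```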
