
Pandas group by columns and perform aggregate on specific columns

I have the DataFrame below.

import pandas as pd

df1 = pd.DataFrame({
    'col1': ["A", "X", "E", "A", "X", "X", "X"],
    'col2': ["B", "Y", "E", "B", "Y", "Y", "Y"],
    'col3': ["C", "Z", "E", "C", "Z", "Z", "Z"],
    'col4': ["D", "A", "F", "D", "A", "A", "A"],
    'Sex': ["Male", "Male", "Male", "Female", "Female", "Null", "Male"],
    'Count': [100, 50, 100, 50, 50, 10, 100],
    'Sum_me': [100, 200, 1, 400, 300, 500, 500],
    'Avg_me': [100, 200, 1, 400, 300, 500, 500],
})

After keeping only the rows that are duplicated on columns col1, col2, col3 and col4, the DataFrame looks like this.

columns = ['col1', 'col2', 'col3','col4']
df1 = df1[df1[columns].duplicated(keep=False)].sort_values('col1').reset_index(drop=True)

  col1 col2 col3 col4     Sex  Count  Sum_me  Avg_me
0    A    B    C    D    Male    100     100     100
1    A    B    C    D  Female     50     400     400
2    X    Y    Z    A    Male     50     200     200
3    X    Y    Z    A  Female     50     300     300
4    X    Y    Z    A    Null     10     500     500
5    X    Y    Z    A    Male    100     500     500
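The filtering step relies on duplicated(keep=False), which flags every row of a repeated key group rather than only the later repeats. A minimal sketch of that behaviour follows; the toy frame and the names toy, mask_all, mask_first and kept are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical toy frame, not the data from the question.
toy = pd.DataFrame({"key": ["a", "b", "a", "c"], "val": [1, 2, 3, 4]})

# keep=False marks every member of a repeated group as a duplicate.
mask_all = toy[["key"]].duplicated(keep=False)

# The default keep="first" leaves the first occurrence unmarked.
mask_first = toy[["key"]].duplicated()

# Boolean indexing keeps only the rows whose key occurs more than once.
kept = toy[mask_all].reset_index(drop=True)
print(kept)
```

This is why the row with col1 == "E", whose key combination occurs only once, is absent from the filtered frame.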

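I am trying to aggregate the Sum_me and Avg_me columns. I also want to create new columns, total_male, total_female and null, filled from the Count column of the rows whose Sex value matches. total_male_female is the sum of the male, female and null counts. I tried the code below, but it does not give the expected result.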

result_df = df1.groupby(columns).agg({'Sum_me':'sum','Avg_me':'mean'}).reset_index()

Below is my expected output. Is there a way to do this with pandas? Any help would be appreciated.

output:

col1 col2 col3 col4  total_male  total_female  null  total_male_female  Sum_me  Avg_me
A    B    C    D            100            50     0                150     500     250
X    Y    Z    A            150            50    10                210    1500     375

Try:

x = df1.pivot_table(
    index=["col1", "col2", "col3", "col4"],
    columns="Sex",
    values="Count",
    aggfunc="sum",
    fill_value=0,
)

g = df1.groupby(["col1", "col2", "col3", "col4"])

out = pd.concat(
    [x, g["Sum_me"].sum(), g["Avg_me"].mean()], axis=1
).reset_index()

print(out)

Prints:

  col1 col2 col3 col4  Female  Male  Null  Sum_me  Avg_me
0    A    B    C    D      50   100     0     500   250.0
1    X    Y    Z    A      50   150    10    1500   375.0
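The output above has the per-sex counts under the raw Sex labels and no total_male_female column. One possible way to get closer to the column layout asked for in the question is to rename the pivoted columns and add a row-wise total. This is only a sketch: it rebuilds the already filtered frame so that it runs on its own, and the names keys, counts, stats and result are illustrative choices rather than anything from the original answer.

```python
import pandas as pd

# The filtered frame from the question, rebuilt here so the sketch is self-contained.
df1 = pd.DataFrame({
    "col1": ["A", "A", "X", "X", "X", "X"],
    "col2": ["B", "B", "Y", "Y", "Y", "Y"],
    "col3": ["C", "C", "Z", "Z", "Z", "Z"],
    "col4": ["D", "D", "A", "A", "A", "A"],
    "Sex": ["Male", "Female", "Male", "Female", "Null", "Male"],
    "Count": [100, 50, 50, 50, 10, 100],
    "Sum_me": [100, 400, 200, 300, 500, 500],
    "Avg_me": [100, 400, 200, 300, 500, 500],
})
keys = ["col1", "col2", "col3", "col4"]

# Sum Count per key group and Sex value; missing combinations become 0.
counts = df1.pivot_table(
    index=keys, columns="Sex", values="Count", aggfunc="sum", fill_value=0
).rename(columns={"Male": "total_male", "Female": "total_female", "Null": "null"})

# Row-wise total of the three count columns.
counts["total_male_female"] = counts[["total_male", "total_female", "null"]].sum(axis=1)

# Named aggregation for the remaining two columns.
stats = df1.groupby(keys).agg(Sum_me=("Sum_me", "sum"), Avg_me=("Avg_me", "mean"))

# Both frames share the same key index, so they can be joined directly.
result = counts.join(stats).reset_index()
result.columns.name = None  # drop the leftover "Sex" label on the column axis
result = result[keys + ["total_male", "total_female", "null",
                        "total_male_female", "Sum_me", "Avg_me"]]
print(result)
```

The rename assumes that all three labels (Male, Female, Null) occur in the Sex column. If one of them were absent from the data, the corresponding column would not exist after the pivot and the total would need to be computed differently.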

