
Creating summary table on groupby dataframe based on condition

I have a pandas dataframe df that looks like this:

userid  trip_id segmentid   actual  prediction
  1       13       40          3       3
  1       6        2           1       1
  1       44       3           2       3
  2       70       19          1       1
  2       12       5           0       0

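I need to create a summary dataframe dfsummary, grouped by the column userid, with three columns: userid, correct_classified, and incorrect_classified. A row is correctly classified if its actual and prediction values are equal; otherwise it is incorrectly classified.

I can get the correctly and incorrectly classified rows for the whole dataframe with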

correct_classified = df[df['actual'] == df['prediction']]
incorrect_classified = df[df['actual'] != df['prediction']]

but I have no idea how to build the summary table grouped by userid. It should look like this:

userid  correct_classified  incorrect_classified
  1             2                    1
  2             2                    0

You can use pd.crosstab after creating an array of condition flags:

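import numpy as np
import pandas as pd

# label each row by whether the prediction matches the actual value
flags = np.where(df['actual'].eq(df['prediction']), 'correct', 'incorrect')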

res = pd.crosstab(df['userid'], flags)

print(res)

col_0   correct  incorrect
userid                    
1             2          1
2             2          0
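To get the exact column names asked for in the question, the crosstab result can be relabelled afterwards. The following is a minimal sketch, assuming the sample data from the question. The `reindex` step is my own addition, not part of the answer above: it keeps both columns present even if one outcome never occurs in the data.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "userid": [1, 1, 1, 2, 2],
    "trip_id": [13, 6, 44, 70, 12],
    "segmentid": [40, 2, 3, 19, 5],
    "actual": [3, 1, 2, 1, 0],
    "prediction": [3, 1, 3, 1, 0],
})

# Use the desired column names directly as the flag labels.
flags = np.where(df["actual"].eq(df["prediction"]),
                 "correct_classified", "incorrect_classified")

dfsummary = (
    pd.crosstab(df["userid"], flags)
    # Assumption: guard against a missing outcome by forcing both columns.
    .reindex(columns=["correct_classified", "incorrect_classified"], fill_value=0)
    .rename_axis(columns=None)   # drop the "col_0" label on the column axis
    .reset_index()               # turn userid back into an ordinary column
)
print(dfsummary)
```

With this, userid is a regular column rather than the index, which matches the layout shown in the question.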

You can also use pivot_table:

m = df['actual'] == df['prediction']

# assign the conditions to new columns and aggregate.
df.assign(correct_classified=m, incorrect_classified=~m).pivot_table(
    index='userid',
    aggfunc='sum',
    values=['correct_classified', 'incorrect_classified'])

Output:

        correct_classified  incorrect_classified
userid                                          
1                      2.0                   1.0
2                      2.0                   0.0
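Since the question asks about grouping by userid, the same boolean mask can also be summed with a plain groupby. This is a sketch of an alternative, not something either answer above proposes. It assumes the sample data from the question, and the `astype(int)` cast is added so that the counts come out as integers.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "userid": [1, 1, 1, 2, 2],
    "trip_id": [13, 6, 44, 70, 12],
    "segmentid": [40, 2, 3, 19, 5],
    "actual": [3, 1, 2, 1, 0],
    "prediction": [3, 1, 3, 1, 0],
})

# True where the prediction matches the actual value.
m = df["actual"].eq(df["prediction"])

dfsummary = (
    df.assign(correct_classified=m, incorrect_classified=~m)
    .groupby("userid")[["correct_classified", "incorrect_classified"]]
    .sum()          # summing booleans counts the True values per user
    .astype(int)    # make sure the counts are integers
    .reset_index()
)
print(dfsummary)
```

Summing a boolean column counts its True values, so each group's sum is the number of rows that satisfy the condition.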
