Getting a win percentage from a one-hot encoded dataframe

I have this one-hot encoded dataframe. The win column is 1 if the observation won and 0 otherwise. How do I get the win percentage for each of the categories (food, fitness, retail, grocery)? Each observation can be in multiple categories, and some ids are duplicated because each id is an experiment in which multiple things can be tested.

id food    fitness   retail    grocery win
1  1       0         1         1       1
2  1       0         0         0       0
3  0       1         0         0       1
4  1       0         0         1       1
4  1       0         0         1       0
5  1       0         1         0       1
6  0       1         1         0       1
6  0       1         1         0       0


Expected output:

category win_percentage
food     .6
fitness  .66
retail   .75
grocery  .66
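To try the answers on this page, the sample table can be rebuilt as a DataFrame. The following is a minimal sketch, assuming the data is exactly the eight rows shown in the question. The names `food_rows` and `food_win_pct` are illustrative only; they spell out the definition for a single category: wins among the rows in that category, divided by the number of rows in that category.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "id":      [1, 2, 3, 4, 4, 5, 6, 6],
        "food":    [1, 1, 0, 1, 1, 1, 0, 0],
        "fitness": [0, 0, 1, 0, 0, 0, 1, 1],
        "retail":  [1, 0, 0, 0, 0, 1, 1, 1],
        "grocery": [1, 0, 0, 1, 1, 0, 0, 0],
        "win":     [1, 0, 1, 1, 0, 1, 1, 0],
    }
)

# Win percentage for one category, written out by hand:
# five rows have food == 1, and three of them have win == 1.
food_rows = df[df["food"] == 1]
food_win_pct = food_rows["win"].sum() / len(food_rows)
print(food_win_pct)
```

Duplicated ids are counted as separate rows here, which is what the expected output in the question implies (3 wins out of 5 food rows gives .6).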

You can do this:

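# Keep only the category columns (drop's positional axis argument was removed in pandas 2.0)
df1 = df.drop(columns=['id', 'win'])
# Turn 0s into NaN, multiply each row by its win flag, then average per column (NaN is skipped)
win_percent = df1[df1 == 1].mul(df.win, axis=0).mean()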

Output:

win_percent

food       0.600000
fitness    0.666667
retail     0.750000
grocery    0.666667
dtype: float64
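The one-liner above packs three steps into a single expression. The sketch below separates them so the intermediate values can be inspected. It rebuilds the sample data from the question, and the names `cats`, `masked` and `outcomes` are illustrative rather than part of the original answer.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "id":      [1, 2, 3, 4, 4, 5, 6, 6],
        "food":    [1, 1, 0, 1, 1, 1, 0, 0],
        "fitness": [0, 0, 1, 0, 0, 0, 1, 1],
        "retail":  [1, 0, 0, 0, 0, 1, 1, 1],
        "grocery": [1, 0, 0, 1, 1, 0, 0, 0],
        "win":     [1, 0, 1, 1, 0, 1, 1, 0],
    }
)

cats = df.drop(columns=["id", "win"])

# Step 1: keep the 1s and turn the 0s into NaN,
# so rows outside a category no longer count for it.
masked = cats[cats == 1]

# Step 2: multiply each row by its win flag. NaN stays NaN, so every
# cell is now 1.0 (won), 0.0 (lost) or NaN (row not in that category).
outcomes = masked.mul(df["win"], axis=0)

# Step 3: mean() skips NaN by default, so each column is averaged
# only over the rows that belong to that category.
win_percent = outcomes.mean()
print(win_percent)
```

The masking step is what makes the denominator the number of rows in each category rather than the total number of rows; calling `mean()` on the unmasked product would divide by 8 for every column.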

To get exactly your expected output:

win_percent.to_frame('win_percent').rename_axis('category').reset_index()

  category  win_percent
0     food     0.600000
1  fitness     0.666667
2   retail     0.750000
3  grocery     0.666667

Alternatively, loop over the category columns and divide the wins in each category by its row count:

import pandas as pd

category, win_percentage = [], []
for c in df.loc[:, "food":"grocery"]:
    category.append(c)
    # Wins among rows in category c, divided by the number of rows in c
    win_percentage.append(df.loc[df[c] == 1, "win"].sum() / df[c].sum())

df_out = pd.DataFrame(
    {"category": category, "win_percentage": win_percentage}
)
print(df_out)

Prints:

  category  win_percentage
0     food        0.600000
1  fitness        0.666667
2   retail        0.750000
3  grocery        0.666667
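Because the category columns hold only 0s and 1s, the same ratio can also be computed without an explicit loop: multiply the columns by `win` and sum to count wins, then divide by the column sums to get the row counts. This is a sketch of a different formulation, not a method taken from either answer. It rebuilds the sample data from the question, and `wins_per_category` and `rows_per_category` are illustrative names.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame(
    {
        "id":      [1, 2, 3, 4, 4, 5, 6, 6],
        "food":    [1, 1, 0, 1, 1, 1, 0, 0],
        "fitness": [0, 0, 1, 0, 0, 0, 1, 1],
        "retail":  [1, 0, 0, 0, 0, 1, 1, 1],
        "grocery": [1, 0, 0, 1, 1, 0, 0, 0],
        "win":     [1, 0, 1, 1, 0, 1, 1, 0],
    }
)

cats = df.loc[:, "food":"grocery"]

# A cell is 1 only when the row is in the category AND the row won.
wins_per_category = cats.mul(df["win"], axis=0).sum()

# Column sums of 0/1 indicators count the rows in each category.
rows_per_category = cats.sum()

df_out = (
    (wins_per_category / rows_per_category)
    .rename_axis("category")
    .reset_index(name="win_percentage")
)
print(df_out)
```

This relies on the indicator columns containing nothing but 0 and 1; a category with no rows at all would produce a 0/0 division and show up as NaN.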
