I'm using Jupyter notebooks, and my current dataframe looks like the following:
players_mentioned | tweet_text | polarity
______________________________________________
[Mane, Salah] | xyz | 0.12
[Salah] | asd | 0.06
How can I group all players individually and average their polarity?
Currently I have tried to use:
df.groupby(df['players_mentioned'].map(tuple))['polarity'].mean()
But this returns a result in which each combination of players mentioned together forms its own group, alongside the players mentioned alone. How best can I split the players up and then group them back together?
An expected output would contain
player | polarity_average
____________________________
Mane | 0.12
Salah | 0.09
In other words, how do I group by each item in the lists in every row?
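If you just want to group by players_mentioned and get the average polarity score for each player, this should do it. (Note that this only works when players_mentioned holds a single hashable value per row; with list values, as in the question, groupby raises a TypeError because lists are unhashable.)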
df.groupby('players_mentioned').polarity.agg('mean')
You can use the unnesting idiom from this answer.
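import numpy as np
import pandas as pd

def unnesting(df, explode):
    # Repeat each row's index label once per element in its list
    idx = df.index.repeat(df[explode[0]].str.len())
    df1 = pd.concat([
        pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
    df1.index = idx
    # Join the untouched columns back on; drop(explode, 1) no longer works in pandas 2.x
    return df1.join(df.drop(columns=explode), how='left')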
You can now call groupby on the unnested "players_mentioned" column.
(unnesting(df, ['players_mentioned'])
    .groupby('players_mentioned', as_index=False)['polarity'].mean())
  players_mentioned  polarity
0              Mane      0.12
1             Salah      0.09
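On pandas 0.25 or later, the built-in DataFrame.explode should do the same unnesting without a helper function. Here is a minimal sketch, assuming the dataframe from the question (rebuilt by hand below) and that players_mentioned holds real Python lists rather than strings; the rename to player / polarity_average is only there to match the expected output in the question.

```python
import pandas as pd

# Sample data rebuilt from the question; the real df would come from the notebook.
df = pd.DataFrame({
    "players_mentioned": [["Mane", "Salah"], ["Salah"]],
    "tweet_text": ["xyz", "asd"],
    "polarity": [0.12, 0.06],
})

result = (
    df.explode("players_mentioned")            # one row per (tweet, player) pair
      .groupby("players_mentioned", as_index=False)["polarity"]
      .mean()                                  # average polarity per player
      .rename(columns={"players_mentioned": "player",
                       "polarity": "polarity_average"})
)
print(result)
```

Selecting the polarity column before calling mean() keeps the text column out of the aggregation, which matters on recent pandas versions where mean() no longer silently drops non-numeric columns.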