This has recently become an everyday task for me: I work with a DataFrame df that has many columns, including user and event. I count the number of event values for each user and add the result to the original df as a new column, count. Then I keep only user and count, which leaves multiple identical rows, and call drop_duplicates() to drop them, which gives the event count for each user. I'm sure I'm doing some redundant work.
What would be an elegant way to do this?
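import matplotlib.pyplot as plt

# Attach each user's event count to every row belonging to that user
df['count'] = df.groupby('user')['event'].transform('count')
# Keep only user and count, then collapse the repeated rows
df = df[['user','count']]
df = df.drop_duplicates()
plt.bar(x=df['user'], height=df['count'])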
Use GroupBy.count on the Series, then call Series.plot.bar:
df.groupby('user')['event'].count().plot.bar()
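To check that the one-liner gives the same numbers as the transform and drop_duplicates() route, here is a minimal sketch on made-up data. Only the column names user and event come from the question; the sample values, the extra column other, and the variable names long_way and counts are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names user and event come from the question.
df = pd.DataFrame({
    "user": ["a", "b", "a", "c", "b", "a"],
    "event": ["login", "login", "click", "login", "click", "logout"],
    "other": range(6),
})

# Route from the question: broadcast the per-user count to every row,
# keep user and count, then drop the repeated rows.
long_way = (
    df.assign(count=df.groupby("user")["event"].transform("count"))[["user", "count"]]
    .drop_duplicates()
    .set_index("user")["count"]
)

# Route from the answer: one aggregated value per user, indexed by user.
counts = df.groupby("user")["event"].count()

print(counts.to_dict())
print(long_way.to_dict() == counts.to_dict())
```

Calling counts.plot.bar() on the result should then draw the same chart as the plt.bar call in the question. Note that count() skips rows where event is missing, so if every row should be counted regardless of missing values, size() may be the better fit.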