How to do these steps better in Pandas: count, drop columns, drop duplicates

This has become an everyday task for me: I work with a df that has many columns, including these two: user and event. I count the number of events for each user and add the result to the original df as a new column, count. Then I keep only user and count, which leaves multiple identical rows, so I call drop_duplicates() to drop them and obtain the event count for each user. I'm sure I'm doing some redundant work.

What would be an elegant way to do such tasks?

import matplotlib.pyplot as plt

# df is an existing DataFrame with 'user' and 'event' columns
df['count'] = df.groupby('user')['event'].transform('count')
df = df[['user', 'count']]
df = df.drop_duplicates()
plt.bar(x=df['user'], height=df['count'])

Use GroupBy.count on the event column to get a Series, then call Series.plot.bar:

df.groupby('user')['event'].count().plot.bar()
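
To check that the one-liner gives the same numbers as the three-step version, here is a minimal sketch on made-up data. Only the column names user and event come from the question; the sample values and the extra column other are assumptions for illustration. It also uses reset_index(name='count') in case a two-column DataFrame is wanted rather than a plot.

```python
import pandas as pd

# Hypothetical sample data; only the 'user' and 'event' column names come from the question.
df = pd.DataFrame({
    "user": ["a", "b", "a", "c", "b", "a"],
    "event": ["login", "login", "click", "login", None, "logout"],
    "other": range(6),
})

# One row per user: number of non-null events, as a DataFrame with columns user and count.
counts = df.groupby("user")["event"].count().reset_index(name="count")

# The three-step approach from the question, sorted so the two results line up.
original = (
    df.assign(count=df.groupby("user")["event"].transform("count"))[["user", "count"]]
    .drop_duplicates()
    .sort_values("user")
    .reset_index(drop=True)
)

print(counts)
print(original)
```

Both approaches count only non-null event values, so user b gets 1 here rather than 2. If missing events should be counted as well, GroupBy.size counts rows per group instead.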


 