How to do these steps better in Pandas: count, drop columns, drop duplicates

Question

This has become an everyday task for me: I work with a df that has many columns, including these two: user and event. I count the number of events for each user and add the result to the original df as a new column, count. Then I keep only user and count, which leaves multiple identical rows, and call drop_duplicates() to drop the duplicates and obtain the event count for each user. I'm sure I'm doing some redundant work.

What would be an elegant way to do such tasks?

import matplotlib.pyplot as plt

# Count events per user, keep one row per user, then plot.
df['count'] = df.groupby('user')['event'].transform('count')
df = df[['user', 'count']]
df = df.drop_duplicates()
plt.bar(x=df['user'], height=df['count'])

Answer 1

Use GroupBy.count on the Series, then call Series.plot.bar:

df.groupby('user')['event'].count().plot.bar()
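
As a rough check that the one-liner gives the same numbers as the three steps in the question, here is a minimal sketch on made-up data. Only the column names user and event come from the question; the sample rows, the extra column other, and the variable names long_way, counts and same are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names `user` and `event`
# come from the question.
df = pd.DataFrame({
    "user": ["alice", "bob", "alice", "carol", "bob", "alice"],
    "event": ["login", "login", "click", "login", "logout", "logout"],
    "other": range(6),
})

# Three-step approach from the question: broadcast the per-user count
# to every row, keep two columns, then drop the repeated rows.
long_way = df.assign(count=df.groupby("user")["event"].transform("count"))
long_way = long_way[["user", "count"]].drop_duplicates()

# One-step approach from the answer: a Series indexed by user.
counts = df.groupby("user")["event"].count()

# Compare the two results as plain {user: count} mappings.
same = long_way.set_index("user")["count"].to_dict() == counts.to_dict()
print(counts)
print(same)

# counts.plot.bar() would draw the bar chart (requires matplotlib).
```

Note that count() skips rows where event is missing (NaN), which matches what transform('count') does in the question, so the two approaches should agree on data with gaps as well.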

1 answer

solution1
0 ACCEPTED 2018-12-07 09:41:11