
Plotting count of unique values in groupby

I have a dataset of this form:

>>> df
         my_timestamp    disease  month
0 2016-01-01 15:00:00       2      jan
0 2016-01-01 11:00:00       1      jan
1 2016-01-02 15:00:00       3      jan  
2 2016-01-03 15:00:00       4      jan  
3 2016-01-04 15:00:00       2      jan  
  

I want to count the number of occurrences of each value per month, then plot the count of every value by month.

df    values    count
jan   2         3

How can I plot it? I want a single plot with the month on the x axis, one line for every value, and the count on the y axis.

If you want to plot by month, you also need to group by year when the data spans multiple years. You can use dt.strftime inside .groupby to group by year and month.

Given the following dataset, slightly altered to include more months:

       my_timestamp  disease    month
2016-01-01 15:00:00       2      jan
2016-02-01 11:00:00       1      feb
2017-01-02 15:00:00       3      jan  
2017-01-02 15:00:00       4      jan  
2016-01-04 15:00:00       2      jan  

You can run the following:

import pandas as pd
df['my_timestamp'] = pd.to_datetime(df['my_timestamp'])
# Number of distinct disease values per year-month, drawn as a single line
df.groupby(df['my_timestamp'].dt.strftime('%Y-%m'))['disease'].nunique().plot()
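
Note that nunique() gives the number of distinct disease values in each month, which is a single line. If the goal is one line per disease value, as the question asks, one possible approach is to group by both the year-month key and the disease value, then unstack. The following is only a sketch under assumptions: the sample rows are copied from the altered dataset above, the names year_month and counts and the output file name are made up for illustration, and the Agg backend is forced only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to use an interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the altered dataset above
df = pd.DataFrame({
    "my_timestamp": [
        "2016-01-01 15:00:00",
        "2016-02-01 11:00:00",
        "2017-01-02 15:00:00",
        "2017-01-02 15:00:00",
        "2016-01-04 15:00:00",
    ],
    "disease": [2, 1, 3, 4, 2],
})
df["my_timestamp"] = pd.to_datetime(df["my_timestamp"])

# Year-month key such as "2016-01" (the name year_month is illustrative)
year_month = df["my_timestamp"].dt.strftime("%Y-%m").rename("year_month")

# Rows = year-month, columns = disease value, cells = number of occurrences
counts = df.groupby([year_month, "disease"]).size().unstack(fill_value=0)

# DataFrame.plot draws one line per column, i.e. one line per disease value
ax = counts.plot(marker="o")
ax.set_xlabel("month")
ax.set_ylabel("count")
plt.savefig("disease_counts_by_month.png")  # hypothetical output file name
```

Months in which a value never appears are filled with 0, so every line covers the whole x axis.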


Here is what I did to get that data into a bar plot. I created a month column, then:

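import matplotlib.pyplot as plt

# Assumption: df_cut is the DataFrame that holds the added month_num column (its creation is not shown)
for v in df.disease.unique():
    # Count the rows for this disease value in each month
    diseases = df_cut[df_cut['disease'] == v].groupby('month_num')['disease'].count()
    x = diseases.index
    y = diseases.values
    plt.bar(x, y)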
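
Because every plt.bar call in the loop draws at the same x positions, bars for different disease values can end up on top of each other. A possible alternative, sketched here with made-up data, is to build a month-by-disease count table with pd.crosstab and let pandas place the bars side by side. The rows in df_cut below are hypothetical, month_num is assumed to be a numeric month column, and the output file name is illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line to use an interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the answerer's df_cut
df_cut = pd.DataFrame({
    "month_num": [1, 1, 1, 2, 2],
    "disease": [2, 2, 3, 1, 2],
})

# Rows = month number, columns = disease value, cells = number of rows
table = pd.crosstab(df_cut["month_num"], df_cut["disease"])

# kind="bar" draws one group of bars per month, one bar per disease value
ax = table.plot(kind="bar")
ax.set_xlabel("month")
ax.set_ylabel("count")
plt.savefig("disease_counts_bar.png")  # hypothetical output file name
```

Passing stacked=True to table.plot would stack the bars within each month instead of placing them next to each other.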
  


 