Pandas histogram of number of occurrences of other columns after groupby
I have a dataframe:
df = Batch_ID DateTime Code A1 A2
ABC. '2019-01-02 17:03:41.000' 230 2. 4
ABC. '2019-01-02 17:03:41.000' 230 1. 5
ABC. '2019-01-02 17:03:42.000' 231 1. 4
ABC. '2019-01-02 17:03:48.000' 232 2. 7
ABC. '2019-01-02 17:04:41.000' 230 2. 9
ABB. '2019-01-02 17:04:41.000' 235 5. 4
ABB. '2019-01-02 17:04:45.000' 236 2. 0
I need to plot a histogram of the number of distinct codes per <Batch_ID, minute>. Note that a 'Code' value may occur multiple times within a group, but it should be counted only once.
So in this case some entries will be:
<ABC, 2019-01-02 17:03> : 3
<ABC, 2019-01-02 17:04> : 1
<ABB, 2019-01-02 17:04> : 2
How can this be done?
Try this, using pd.Grouper on a datetime dtype column:
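import pandas as pd

df = pd.read_clipboard(sep=r'\s\s+')  # copy the table first; columns are split on 2+ spaces
df['DateTime'] = pd.to_datetime(df['DateTime'].str.strip("'"))
# nunique() counts each distinct Code once per (Batch_ID, minute); 'min' replaces the deprecated 'T' alias
df.groupby(['Batch_ID', pd.Grouper(key='DateTime', freq='min')])['Code'].nunique().rename('Count').reset_index()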
Output:
Batch_ID DateTime Count
0 ABB. 2019-01-02 17:04:00 2
1 ABC. 2019-01-02 17:03:00 3
2 ABC. 2019-01-02 17:04:00 1
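The question asks for a histogram, so one more step is needed after the grouped counts. Below is a minimal self-contained sketch, assuming the sample rows from the question are typed in directly instead of read from the clipboard (columns A1 and A2 are left out because they are not used). The names `counts` and `hist` are illustrative only.

```python
import pandas as pd

# Sample rows copied from the question (A1 and A2 omitted, they are not needed here).
df = pd.DataFrame({
    "Batch_ID": ["ABC.", "ABC.", "ABC.", "ABC.", "ABC.", "ABB.", "ABB."],
    "DateTime": pd.to_datetime([
        "2019-01-02 17:03:41", "2019-01-02 17:03:41", "2019-01-02 17:03:42",
        "2019-01-02 17:03:48", "2019-01-02 17:04:41", "2019-01-02 17:04:41",
        "2019-01-02 17:04:45",
    ]),
    "Code": [230, 230, 231, 232, 230, 235, 236],
})

# Number of distinct codes per (Batch_ID, minute).
counts = (
    df.groupby(["Batch_ID", pd.Grouper(key="DateTime", freq="min")])["Code"]
      .nunique()
      .rename("Count")
      .reset_index()
)

# Histogram data: how many (Batch_ID, minute) groups share each distinct-code count.
hist = counts["Count"].value_counts().sort_index()
print(hist)
```

To draw the plot itself, `hist.plot.bar()` or `counts["Count"].plot.hist()` should work, assuming matplotlib is installed.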