
Python group by array column and partition by row number

I have an array as below:

[('test1@test.com', '220104'), ('test2@test.com', '220104'), ('test3@test.com', '220106'), ('test4@test.com', '220106')]

Here the email ID will always be unique, but the date may repeat. The date format will always be YYMMDD. The number should restart at 1 whenever the date changes.

I need output as below. The third column is the running count of emails for that day:

'test1@test.com', '220104', 1
'test2@test.com', '220104', 2
'test3@test.com', '220106', 1
'test4@test.com', '220106', 2

So if a 5th email comes in that is also dated the 6th (220106), the output will be:

'test1@test.com', '220104', 1
'test2@test.com', '220104', 2
'test3@test.com', '220106', 1
'test4@test.com', '220106', 2
'test5@test.com', '220106', 3

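You can use cumcount here. After a groupby, cumcount() numbers the rows within each group starting from 0, so add 1 to get a count that starts at 1: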

import pandas as pd
df = pd.DataFrame([('test1@test.com', '220104'), ('test2@test.com', '220104'), ('test3@test.com', '220106'), ('test4@test.com', '220106')], columns=['email', 'date'])
# cumcount() is 0-based within each date group, so add 1
df['date_grouped_index'] = df.groupby('date').cumcount() + 1
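
If the data stays a plain list of tuples and pandas is not otherwise needed, the same numbering can be sketched with a dictionary of per-date counters. This is only an illustration under that assumption; the function name number_within_date is made up for this example and is not part of any library.

```python
from collections import defaultdict


def number_within_date(rows):
    """Append a 1-based running count per date to each (email, date) tuple."""
    seen = defaultdict(int)  # date -> number of rows seen so far with that date
    numbered = []
    for email, date in rows:
        seen[date] += 1
        numbered.append((email, date, seen[date]))
    return numbered


rows = [('test1@test.com', '220104'), ('test2@test.com', '220104'),
        ('test3@test.com', '220106'), ('test4@test.com', '220106')]

for row in number_within_date(rows):
    print(row)
```

Like groupby('date').cumcount(), this counts per date value across the whole list, so rows with the same date keep counting up even if they are not adjacent.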
