Count the unique values for the combined columns, and put them in a dataframe
I am trying to count the unique values for two combined columns and put the result in a DataFrame. One column is called 'Municipality' and the other 'Date'. Municipality has 27 different names and Date has 151 dates for each municipality, 4,077 rows altogether. I can put the two columns in a DataFrame, but I cannot get the count, i.e.:
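# Take rows 5247..9323 of each column and renumber them from 0
days1 = df['Municipality']
days = days1[5247:9324].reset_index(drop=True)
ddate1 = df['Date']
ddate = ddate1[5247:9324].reset_index(drop=True)
# Put the two columns side by side in one DataFrame
frames = [days, ddate]
result = pd.concat(frames, axis=1)
result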
Municipality Date
0 Alta Floresta D'Oeste 2020-03-27
1 Alta Floresta D'Oeste 2020-03-28
2 Alta Floresta D'Oeste 2020-03-29
3 Alta Floresta D'Oeste 2020-03-30
4 Alta Floresta D'Oeste 2020-03-31
... ... ...
4072 Alto Paraíso 2020-08-20
4073 Alto Paraíso 2020-08-21
4074 Alto Paraíso 2020-08-22
4075 Alto Paraíso 2020-08-23
4076 Alto Paraíso 2020-08-24
4077 rows × 2 columns
The goal is to list each of the 27 municipalities and count the dates for each one, which should be 151 each. I'm new to this, so thanks for any help.
I have a feeling you're looking for groupby.transform. With it, you can add a column that holds the number of dates for each municipality.
import pandas as pd
result['date_count'] = result.groupby('Municipality')['Date'].transform('count')
result
Municipality Date date_count
0 Alta Floresta D'Oeste 2020-03-27 5
1 Alta Floresta D'Oeste 2020-03-28 5
2 Alta Floresta D'Oeste 2020-03-29 5
3 Alta Floresta D'Oeste 2020-03-30 5
4 Alta Floresta D'Oeste 2020-03-31 5
5 Alto Paraíso 2020-08-20 5
6 Alto Paraíso 2020-08-21 5
7 Alto Paraíso 2020-08-22 5
8 Alto Paraíso 2020-08-23 5
9 Alto Paraíso 2020-08-24 5
In your own dataset, the 'date_count' column should say 151. You can read more about groupby.transform in the pandas documentation.
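If what you want is one row per municipality rather than a count repeated on every row, a plain groupby aggregation may be closer to the goal. The sketch below is only an illustration: the small `result` frame is made-up sample data (two municipalities, four dates each, plus one deliberately duplicated row), and the column names `rows` and `unique_dates` are my own choice, not something from the question.

```python
import pandas as pd

# Made-up sample data standing in for the asker's `result` frame:
# 2 municipalities x 4 consecutive dates.
dates = pd.date_range("2020-03-27", periods=4, freq="D")
names = ["Alta Floresta D'Oeste", "Alto Paraíso"]
result = pd.DataFrame(
    [(name, date) for name in names for date in dates],
    columns=["Municipality", "Date"],
)

# Append a copy of the first row so that the row count and the
# distinct-date count differ for one municipality.
result = pd.concat([result, result.iloc[[0]]], ignore_index=True)

# One row per municipality:
#   rows         -> number of rows in the group
#   unique_dates -> number of distinct dates in the group
counts = (
    result.groupby("Municipality")
    .agg(rows=("Date", "size"), unique_dates=("Date", "nunique"))
    .reset_index()
)
print(counts)
```

Here `size` counts rows while `nunique` counts distinct dates, so the two columns only disagree when a municipality has repeated dates. On the full dataset described in the question, both would be expected to show 151 for each of the 27 municipalities.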