Python Pandas DataFrame - How to sum values in one column based on a partial match in another column (date type)?
I have encountered some issues while processing my dataset using a Pandas DataFrame.
Here is my dataset:
My data types are displayed below:
My dataset is derived from:
MY_DATASET = pd.read_excel(EXCEL_FILE_PATH, index_col=None, na_values=['NA'], usecols="A, D")
1) I would like to sum all values in the "NUMBER OF PEOPLE" column for each month in the "DATE" column. For example, all values in the "NUMBER OF PEOPLE" column would be added together whenever the value in the "DATE" column falls in the same month: "2020-01", "2020-02", ...
However, I am stuck because I am unsure how to use .groupby on a partial match.
2) After 1) is completed, I would also like to convert the values in the "DATE" column from YYYY-MM-DD to YYYY-MMM, like 2020-Jan.
However, I am unsure whether such a format exists.
Does anyone know how to resolve these issues?
Many thanks!
s = df['NUMBER OF PEOPLE'].groupby(pd.to_datetime(df['DATE']).dt.strftime('%Y-%b')).sum()
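To see what that one-liner does, here is a minimal sketch on made-up data. The column names come from the question; the sample values and the names `month_key` and `monthly_totals` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "DATE": ["2020-01-05", "2020-01-20", "2020-02-03", "2020-02-28", "2020-03-15"],
    "NUMBER OF PEOPLE": [10, 5, 7, 3, 4],
})

# Build one "YYYY-Mon" key per row; rows from the same month share a key.
month_key = pd.to_datetime(df["DATE"]).dt.strftime("%Y-%b")

# Sum the counts of all rows that share a key.
monthly_totals = df["NUMBER OF PEOPLE"].groupby(month_key).sum()
print(monthly_totals)
```

Note that the group keys are plain strings here, so the result is sorted alphabetically rather than by calendar order. Passing `sort=False` to `groupby` keeps the groups in order of first appearance instead.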
You can get an abbreviated month name using strftime('%b'). The output depends on your locale, so in some locales the month name will be all in lowercase:
# assumes the DATE column already has a datetime dtype
df['group_date'] = df['DATE'].apply(lambda x: x.strftime('%Y-%b'))
If you need the first letter of the month in uppercase, you could do something like this:
df['group_date'] = df['group_date'].apply(lambda x: f'{x[0:5]}{x[5].upper()}{x[6:]}')
# or in one step:
df['group_date'] = df['DATE'].apply(lambda x: x.strftime('%Y-%b')).apply(lambda x: f'{x[0:5]}{x[5].upper()}{x[6:]}')
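The slicing relies on the fixed layout of the key: characters 0 to 4 hold "YYYY-" and character 5 is the first letter of the month. A small sketch of the same idea using the vectorized `.str` accessor instead of `apply`, on hypothetical lowercase keys (the `keys` series is made up for illustration):

```python
import pandas as pd

# Hypothetical keys, as strftime might produce them in a locale
# that writes month names in lowercase.
keys = pd.Series(["2020-jan", "2020-feb", "2020-mar"])

# Keep "YYYY-" as is, uppercase the first letter of the month,
# then append the rest of the month name unchanged.
capitalized = keys.str[:5] + keys.str[5].str.upper() + keys.str[6:]
print(capitalized.tolist())  # ['2020-Jan', '2020-Feb', '2020-Mar']
```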
Now you just need to .groupby and .sum():
result = df['NUMBER OF PEOPLE'].groupby(df.group_date).sum()
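One caveat with grouping on formatted strings is that the result sorts alphabetically, not by calendar order. A possible alternative, sketched here on made-up data, is to group on a monthly period first and format the labels only at the end. The sample values and the name `by_month` are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data with DATE already parsed as datetime.
df = pd.DataFrame({
    "DATE": pd.to_datetime(["2020-01-05", "2020-01-20", "2020-02-03", "2020-03-15"]),
    "NUMBER OF PEOPLE": [10, 5, 7, 4],
})

# Group on a monthly Period so the groups sort chronologically.
by_month = df.groupby(df["DATE"].dt.to_period("M"))["NUMBER OF PEOPLE"].sum()

# Convert the period labels to "YYYY-Mon" strings only for display.
by_month.index = by_month.index.strftime("%Y-%b")
print(by_month)
```

With this approach January comes before February in the output, because the sort happens on the periods rather than on the month names.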