How to sum a column based on another column's value in Excel
I would like to ask how to compute a sum using Python or Excel. I want to sum the "number" column based on the "time" column. For example, the sum of the Duration for (00:00 am - 00:59 am) is 2 + 4 = 6, and the sum of the Duration for (02:00 am - 02:59 am) is 3 + 1 = 4. Could you please advise how to do this?
When you have a DataFrame, you can use groupby to accomplish this:
# Import the pandas module
import pandas as pd
# Create a dictionary with the sample values
data = {
    'time': ["12:20:51", "12:40:51", "2:26:35", "2:37:35"],
    'number': [2, 4, 3, 1],
}
# Create a pandas DataFrame from the dictionary
df = pd.DataFrame(data)
# Alternatively, load the data from a CSV file instead of the dictionary:
# df = pd.read_csv('path/dir/filename.csv')
# Convert the time column to the datetime data type
df['time'] = pd.to_datetime(df['time'], format='%H:%M:%S')
# Sum the 'number' values for each hour of the day
dff = df.groupby(df['time'].dt.hour)['number'].sum()
print(dff.head(50))
Output (groupby sorts the hours in ascending order):
time
2     4
12    6
Name: number, dtype: int64
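The question describes each group as a time range such as "02:00 am - 02:59 am", while the grouped result is indexed by the bare hour number. The following is a minimal sketch of relabeling the hourly sums as ranges. The "HH:00 - HH:59" label format and the names `hours`, `hourly`, and `time_range` are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Same sample values as in the answer above
df = pd.DataFrame({
    "time": ["12:20:51", "12:40:51", "2:26:35", "2:37:35"],
    "number": [2, 4, 3, 1],
})

# Extract the hour of each timestamp and sum 'number' per hour
hours = pd.to_datetime(df["time"], format="%H:%M:%S").dt.hour
hourly = df.groupby(hours)["number"].sum()

# Hypothetical label format: turn hour 2 into "02:00 - 02:59"
hourly.index = [f"{int(h):02d}:00 - {int(h):02d}:59" for h in hourly.index]
hourly.index.name = "time_range"  # illustrative name for the new index

print(hourly)
```

If the source times are on a 12-hour clock with an AM/PM marker, the parsing format would need to change accordingly (for example `%I:%M:%S %p`); the sample data here carries no such marker, so the sketch treats the values as 24-hour times.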
When you need to group by more than one column, you can pass the columns as a list inside .groupby(). The code will look like this:
import pandas as pd
# Load the data from a CSV file
df = pd.read_csv('filename.csv')
# Convert the time and date columns to the datetime data type
df['time'] = pd.to_datetime(df['time'], format='%H:%M:%S')
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%Y')
# Sum the 'number' values for each date and hour
dff = df.groupby([df['date'], df['time'].dt.hour])['number'].sum()
print(dff.head(50))
# Save the result (this overwrites the input file if the names are the same)
dff.to_csv("filename.csv")
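Because the code above depends on a CSV file, here is a self-contained sketch of the same date-and-hour grouping that can be run as is. The sample rows, the helper column `hour`, and the variable `result` are hypothetical and only stand in for the real file contents.

```python
import pandas as pd

# Hypothetical sample rows standing in for the CSV file
df = pd.DataFrame({
    "date": ["01/02/2021", "01/02/2021", "01/02/2021", "02/02/2021"],
    "time": ["12:20:51", "12:40:51", "2:26:35", "2:37:35"],
    "number": [2, 4, 3, 1],
})

# Parse the date and pull the hour out of the time column
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")
df["hour"] = pd.to_datetime(df["time"], format="%H:%M:%S").dt.hour

# as_index=False keeps 'date' and 'hour' as regular columns in the result
result = df.groupby(["date", "hour"], as_index=False)["number"].sum()

print(result)
```

Storing the hour in its own column and using `as_index=False` gives a flat table with one row per date and hour, which may be easier to write back to CSV or open in Excel than a Series with a two-level index.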