
Calculate the average date every x rows

Previously I posted a question asking how to calculate the average value for every 10 rows, which Zero's code answered successfully: Calculating the average value for every 10 cells in each column by pandas

However, that approach raises an error when averaging the date/time column.

import glob

import numpy as np
import pandas as pd

location2 = 'C:\\Users\\Poon\\Downloads\\20211014_SBS_BEMS\\20211014_SBS_BEMS\\Test1044.csv'
csvfiles2 = glob.glob(location2)

df3 = pd.DataFrame()

for file_new_2 in csvfiles2:
    df3 = pd.read_csv(file_new_2)

    # Parse the first column as datetimes and the remaining columns as numbers
    df4 = pd.concat([pd.to_datetime(df3.iloc[:, 0]), df3.iloc[:, 1:].apply(pd.to_numeric)], axis=1)
    df4.dropna(inplace=True)
    # This line raises the error: mean() cannot average the datetime column
    df4 = df4.groupby(np.arange(len(df4)) // 10).mean()

print(df4)

The error message is:

Unable to parse string "2019-05-19 00:00:00" at position 0

I guess the values produced by pd.to_datetime cannot be summed up and then divided by 10?

Here are some of the data from my Excel file, but in total there are about 100k rows.

19/5/2019 0:00  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:01  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:02  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:03  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:04  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:05  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:06  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:07  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:08  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:09  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:10  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:11  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:12  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:13  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:14  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:15  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:16  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:17  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:18  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:19  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:20  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:21  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:22  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:23  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:24  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:25  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:26  8840    20  237 64.93   82.35   16.15   46.88
19/5/2019 0:27  8840    20  237 64.93   82.35   16.15   46.88

Assuming column 0 in the provided example, convert the timestamps to integers, group by the floor division of the index by 10, and aggregate:

import numpy as np
import pandas as pd

# Convert to nanoseconds since the epoch, average each group of 10,
# then convert the result back to datetimes
out = pd.to_datetime(pd.to_datetime(df[0])
                       .astype(np.int64)
                       .groupby(df.index // 10)
                       .mean())

Output:

0   2019-05-19 00:04:30
1   2019-05-19 00:14:30
2   2019-05-19 00:23:30
Name: 0, dtype: datetime64[ns]
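Putting this together with the numeric columns, a self-contained sketch might look like the following (the miniature inline CSV stands in for the real file, and `dayfirst=True` is an assumption based on the `19/5/2019` sample format):

```python
import numpy as np
import pandas as pd
from io import StringIO

# Miniature stand-in for the real CSV (no header row, day-first dates)
csv_data = StringIO(
    "19/5/2019 0:00,8840,20\n"
    "19/5/2019 0:01,8840,20\n"
    "19/5/2019 0:02,8840,20\n"
    "19/5/2019 0:03,8840,20\n"
)
df = pd.read_csv(csv_data, header=None)

# Group label: floor division of the row index by 10
groups = df.index // 10

# Average timestamps via their integer (nanosecond) representation,
# then convert the per-group means back to datetimes
ts_mean = pd.to_datetime(
    pd.to_datetime(df[0], dayfirst=True).astype(np.int64).groupby(groups).mean()
)

# Average the numeric columns with the same grouping
num_mean = df.iloc[:, 1:].groupby(groups).mean()

out = pd.concat([ts_mean, num_mean], axis=1)
print(out)
```

Note that averaging nanosecond timestamps through float64 can lose sub-microsecond precision, which is harmless at minute resolution like this.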

You can use resample. Note that this bins by 10-minute time windows rather than strictly every 10 rows; the two coincide here because the data is sampled once per minute:

>>> (df4.assign(**{'dt': pd.to_datetime(df4.iloc[:, 0])})[1:]
        .resample('10T', on='dt').mean())

                          1     2      4      5      6
dt                                                    
2019-05-19 00:00:00  8840.0  20.0  82.35  16.15  46.88
2019-05-19 00:10:00  8840.0  20.0  82.35  16.15  46.88
2019-05-19 00:20:00  8840.0  20.0  82.35  16.15  46.88

This is easier if your columns have names.
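With named columns the same idea reads more cleanly. A minimal sketch, where the column names (`timestamp`, `power`, `temp`) and sample values are illustrative assumptions:

```python
import pandas as pd
from io import StringIO

# Hypothetical sample with a header row; names are made up for illustration
csv_data = StringIO(
    "timestamp,power,temp\n"
    "19/5/2019 0:00,8840,20\n"
    "19/5/2019 0:05,8840,20\n"
    "19/5/2019 0:10,8840,22\n"
    "19/5/2019 0:15,8840,22\n"
)
df = pd.read_csv(csv_data, parse_dates=["timestamp"], dayfirst=True)

# Bin rows into 10-minute windows keyed on the timestamp column
# and average every numeric column within each window
out = df.resample("10min", on="timestamp").mean()
print(out)
```

Unlike the floor-division approach, `resample` keeps its bins aligned to the clock even if some rows are missing, which may or may not be what you want for "every 10 rows".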
