
Taking average of Dataframe, Object values not printing

I have a Data frame as follows:

Date Time U V
2023-01-03 10:00:21:08 5 8
2023-01-03 10:00:25:08 2 3
2023-01-03 10:00:27:08 3 3
2023-01-03 10:00:29:08 6 6
2023-01-03 10:00:31:08 4 6
2023-01-03 10:00:35:08 9 7
2023-01-03 10:00:37:08 10 4
2023-01-03 10:00:39:08 11 12

I am trying to average every 2 rows and print the result for each group: the Date and Time of the group, plus the averaged U and V.

I have done this:

data_columns=['Date','Time','U','V']
df3= data.groupby(np.arange(len(data))//2).mean()
print(df3)

However, my numeric values are averaged and printed, but my Date and Time are not. I know that's because the mean only works on numeric values, and the Date and Time columns are of type object. How do I get the Date and Time to print so that the first time in each averaged group is shown? Like so:

Date Time U V
2023-01-03 10:00:21:08 3.5 5.5 
2023-01-03 10:00:27:08 4.5 4.5
2023-01-03 10:00:31:08 6.5 6.5
2023-01-03 10:00:37:08 10.5 8

Note: in reality the group size is 300 rows, i.e. data.groupby(np.arange(len(data))//300).mean()

IIUC, the groupby() function is usually called with a column name or a list of column names, like

groupby('Date') or groupby(['Date','Time'])
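For the fixed-size row groups in the question, one possible sketch is to keep the np.arange(len(data))//n key and give agg() a per-column mapping, taking the first Date and Time of each group and the mean of U and V. The frame below is rebuilt from the sample rows in the question; the names chunk and groups are only illustrative, and this assumes Date and Time are plain strings.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
data = pd.DataFrame({
    "Date": ["2023-01-03"] * 8,
    "Time": ["10:00:21:08", "10:00:25:08", "10:00:27:08", "10:00:29:08",
             "10:00:31:08", "10:00:35:08", "10:00:37:08", "10:00:39:08"],
    "U": [5, 2, 3, 6, 4, 9, 10, 11],
    "V": [8, 3, 3, 6, 6, 7, 4, 12],
})

chunk = 2  # rows per group (illustrative; the question mentions 300 for the real data)
groups = np.arange(len(data)) // chunk  # 0, 0, 1, 1, 2, 2, 3, 3

# Keep the first Date/Time of each group and average the numeric columns.
df3 = data.groupby(groups).agg(
    {"Date": "first", "Time": "first", "U": "mean", "V": "mean"}
)
print(df3)
```

With the sample rows this keeps 10:00:21:08, 10:00:27:08, 10:00:31:08 and 10:00:37:08 as the group times, which matches the layout requested in the question.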

UPDATE: After discussing with @doctorstrange, I would like to update the answer. The code below creates some dummy data:

import pandas as pd
from datetime import datetime
import time
import random

logfile='log.csv'
c=0

# 'w' starts a fresh file so the header is written only once, even on re-runs
with open(logfile,'w') as handle:
    handle.write('Date,Time,U,V\n')
    # write 101 rows (c = 0..100), one per second
    while c<=100:
        now=datetime.now()
        D=now.strftime('%Y-%m-%d')
        T=now.strftime('%H:%M:%S:%f')
        u=random.randint(1, 20)
        v=random.randint(1, 20)
        handle.write(f'{D},{T},{u},{v}\n')
        c=c+1
        time.sleep(1)

Once loaded into a pandas DataFrame, the data looks like this:

Date Time U V
0 2023-01-13 11:31:43:481765 15 8
1 2023-01-13 11:31:44:485698 13 11
2 2023-01-13 11:31:45:489411 7 11
3 2023-01-13 11:31:46:492592 6 11
4 2023-01-13 11:31:47:496405 18 17
... ... ... ... ...
96 2023-01-13 11:33:19:812832 10 5
97 2023-01-13 11:33:20:812896 5 15
98 2023-01-13 11:33:21:816767 16 9
99 2023-01-13 11:33:22:818553 6 8
100 2023-01-13 11:33:23:818651 13 6

The data was written to the log file, log.csv. Read it into a DataFrame named data:

data=pd.read_csv(logfile)
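If waiting about 100 seconds for the logger is inconvenient, a frame with the same columns can be built in memory instead. This is only a sketch: the start time is taken from the sample output, the one-second spacing is exact (so the microsecond part is always 000000), and the names start and rows are illustrative.

```python
import random
import pandas as pd

start = pd.Timestamp("2023-01-13 11:31:43")  # assumed start time, for illustration
rows = []
for i in range(101):  # same row count as the logger loop (0..100)
    stamp = start + pd.Timedelta(seconds=i)
    rows.append({
        "Date": stamp.strftime("%Y-%m-%d"),
        "Time": stamp.strftime("%H:%M:%S:%f"),  # same colon-separated layout as log.csv
        "U": random.randint(1, 20),
        "V": random.randint(1, 20),
    })

data = pd.DataFrame(rows)
print(data.head())
```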

Create a new column, timestamp, by applying pd.Timestamp.combine(), which combines Date and Time:

data['timestamp']=data.apply(lambda x: pd.Timestamp.combine(
     datetime.strptime(x['Date'],'%Y-%m-%d')
    ,datetime.strptime(x['Time'],'%H:%M:%S:%f').time()
   ), axis=1)

You will get this:

Date Time U V timestamp
0 2023-01-13 11:31:43:481765 15 8 2023-01-13 11:31:43.481765
1 2023-01-13 11:31:44:485698 13 11 2023-01-13 11:31:44.485698
2 2023-01-13 11:31:45:489411 7 11 2023-01-13 11:31:45.489411
3 2023-01-13 11:31:46:492592 6 11 2023-01-13 11:31:46.492592
4 2023-01-13 11:31:47:496405 18 17 2023-01-13 11:31:47.496405
... ... ... ... ... ...
96 2023-01-13 11:33:19:812832 10 5 2023-01-13 11:33:19.812832
97 2023-01-13 11:33:20:812896 5 15 2023-01-13 11:33:20.812896
98 2023-01-13 11:33:21:816767 16 9 2023-01-13 11:33:21.816767
99 2023-01-13 11:33:22:818553 6 8 2023-01-13 11:33:22.818553
100 2023-01-13 11:33:23:818651 13 6 2023-01-13 11:33:23.818651
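As an alternative to the row-wise apply() above, the same column can probably be built in one vectorised call by joining the two string columns and passing an explicit format to pd.to_datetime(). The two rows below are taken from the sample output; this sketch assumes Date and Time were read as strings.

```python
import pandas as pd

# Two rows in the same layout as log.csv (values from the sample output).
data = pd.DataFrame({
    "Date": ["2023-01-13", "2023-01-13"],
    "Time": ["11:31:43:481765", "11:31:44:485698"],
    "U": [15, 13],
    "V": [8, 11],
})

# Join the two string columns and parse them in a single call.
data["timestamp"] = pd.to_datetime(
    data["Date"] + " " + data["Time"], format="%Y-%m-%d %H:%M:%S:%f"
)
print(data)
```

On large logs this tends to be faster than apply(axis=1), since the parsing is not done one row at a time in Python.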

Then, since you use a df3 variable, the code below calls resample() with rule='min' (one bin per minute) on the timestamp column, with offset='0s' (the bin edges are not shifted). It then takes the mean() of U, resets the index, and names the resulting column U_mean:

df3=data.resample(rule='min', on='timestamp', offset='0s').U.mean().reset_index(name='U_mean')
df3

The first result looks like this:

timestamp U_mean
0 2023-01-13 11:31:00 9.823529
1 2023-01-13 11:32:00 10.783333
2 2023-01-13 11:33:00 8.250000

Next, compute df4 in the same way for the mean of V:

df4=data.resample(rule='min', on='timestamp', offset='0s').V.mean().reset_index(name='V_mean')
df4

Then merge df3 and df4 on timestamp to get the final result:

pd.merge(df3,df4, on='timestamp')

Finally, the result is

timestamp U_mean V_mean
0 2023-01-13 11:31:00 9.823529 11.117647
1 2023-01-13 11:32:00 10.783333 10.716667
2 2023-01-13 11:33:00 8.250000 12.541667
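The two resample() calls and the merge can likely be collapsed into one step by selecting both columns before taking the mean. The sketch below uses a tiny stand-in frame with four made-up rows straddling a minute boundary; add_suffix() is used only to reproduce the U_mean and V_mean column names.

```python
import pandas as pd

# Stand-in for the logged data: four rows across a minute boundary (made-up values).
data = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2023-01-13 11:31:58", "2023-01-13 11:31:59",
        "2023-01-13 11:32:00", "2023-01-13 11:32:01",
    ]),
    "U": [2, 4, 6, 8],
    "V": [1, 3, 5, 7],
})

# One resample over both columns, then rename and restore timestamp as a column.
result = (
    data.resample("min", on="timestamp")[["U", "V"]]
    .mean()
    .add_suffix("_mean")
    .reset_index()
)
print(result)
```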

Hope this helps.
