
Converting dataframe column from object to timedelta and summing

I have a Pandas dataframe that I'm segregating by month:

months = [g for n, g in df.groupby(pd.Grouper(key='DATE',freq='M'))]

I'm then taking each month and summing the integers in a column called PARTS RUN.

parts_run_month_sum = months[month]['PARTS RUN'].sum()

This is all working fine. The last thing I need to do is add together the hours in the column HOURS RUN (HR:MIN). This column has the object data type. The data itself is in a timedelta format, not datetime. The format is like so: 02:11:40, being hours:minutes:seconds.

If I run the code below, it prints the correct index numbers for however many rows there are in each month:

for run_time in range(len(months[month]['HOURS RUN (HR:MIN)'])):
    print(run_time)

But if I try to get the times themselves, I receive a KeyError: 0, although a key of 0 is returned for each month in the example above.

for run_time in range(len(months[month]['HOURS RUN (HR:MIN)'])):
    print(months[month]['HOURS RUN (HR:MIN)'][run_time])

What I'm actually looking for is how to sum the column of times, but because they are objects I cannot do this.

How can I convert a column with format hours:minutes:seconds to timedelta and sum the times?

I think you need:

df['HOURS RUN (HR:MIN)'] = pd.to_timedelta(df['HOURS RUN (HR:MIN)'])

# if the values are times (datetime.time objects), convert them to strings first
df['HOURS RUN (HR:MIN)'] = pd.to_timedelta(df['HOURS RUN (HR:MIN)'].astype(str))
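As a minimal sketch of the conversion above: the sample values here are made up for illustration, and only the column name comes from the question. It shows both variants (strings and `datetime.time` objects), then sums the converted column.

```python
import datetime as dt

import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
col = 'HOURS RUN (HR:MIN)'
df = pd.DataFrame({col: ['02:11:40', '00:48:20', '10:00:00']})

# Strings in hours:minutes:seconds form convert directly.
df[col] = pd.to_timedelta(df[col])
total = df[col].sum()
print(total)

# A sum can exceed 24 hours, so total hours are often easier to read.
total_hours = total.total_seconds() / 3600
print(total_hours)

# If the cells hold datetime.time objects, cast them to strings first.
times = pd.Series([dt.time(2, 11, 40), dt.time(0, 48, 20)])
from_times = pd.to_timedelta(times.astype(str))
print(from_times.sum())
```

Once the column holds timedeltas rather than objects, `.sum()` works the same way it does for the PARTS RUN integers.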

Instead of your solution, I think you can aggregate with sum:

df1 = df.groupby(pd.Grouper(key='DATE',freq='M'))['HOURS RUN (HR:MIN)'].sum()
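Here is a small end-to-end sketch of the aggregation, using made-up sample rows. One deliberate difference from the line above: it groups by `df['DATE'].dt.to_period('M')` instead of `pd.Grouper(key='DATE', freq='M')`, because recent pandas versions spell the month-end offset alias `'ME'`, while `'M'` remains valid for periods. The last part illustrates a likely cause of the `KeyError: 0` (an assumption, since the original data is not shown): each monthly subset keeps the row labels of the original frame, so `[0]` looks up a label, not a position.

```python
import pandas as pd

# Hypothetical sample data; column names taken from the question.
col = 'HOURS RUN (HR:MIN)'
df = pd.DataFrame({
    'DATE': pd.to_datetime(['2021-01-05', '2021-01-20',
                            '2021-02-03', '2021-02-17']),
    'PARTS RUN': [10, 20, 5, 7],
    col: ['02:11:40', '00:48:20', '01:00:00', '00:30:00'],
})
df[col] = pd.to_timedelta(df[col])

# Group by calendar month and sum both columns in one pass.
month_key = df['DATE'].dt.to_period('M')
monthly = df.groupby(month_key)[['PARTS RUN', col]].sum()
print(monthly)

# A monthly subset keeps the original row labels (here 2 and 3),
# so label-based access with [0] would raise KeyError: 0.
feb = df[month_key == pd.Period('2021-02', freq='M')]
first_feb = feb[col].iloc[0]  # position-based access works
print(first_feb)
```

Aggregating on the grouped object avoids building the list of monthly frames and looping over their rows at all.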
