I am trying to assign a column to an existing df. Specifically, certain timestamps get sorted, but the current output is a separate Series. I'd like to append this to the df.
import pandas as pd
d = ({
'time' : ['08:00:00 am','12:00:00 pm','16:00:00 pm','20:00:00 pm','2:00:00 am','13:00:00 pm','3:00:00 am'],
'code' : ['A','B','C','A','B','C','A'],
})
df = pd.DataFrame(data=d)
df['time'] = pd.to_timedelta(df['time'])
cutoff, day = pd.to_timedelta(['3.5H', '24H'])
df.time.apply(lambda x: x if x > cutoff else x + day).sort_values().reset_index(drop=True)
x = df.time.apply(lambda x: x if x > cutoff else x + day).sort_values().reset_index(drop=True).dt.components
x = x.apply(lambda x: '{:02d}:{:02d}:{:02d}'.format(x.days*24+x.hours, x.minutes, x.seconds), axis=1)
Output:
0 08:00:00
1 12:00:00
2 13:00:00
3 16:00:00
4 20:00:00
5 26:00:00
6 27:00:00
I've altered the last line to assign the result back to the df:
df['time'] = x.apply(lambda x: '{:02d}:{:02d}:{:02d}'.format(x.days*24+x.hours, x.minutes, x.seconds), axis=1)
But this produces
time code
0 08:00:00 A
1 12:00:00 B
2 13:00:00 C
3 16:00:00 A
4 20:00:00 B
5 26:00:00 C
6 27:00:00 A
As you can see, the timestamps aren't aligned with their respective values after sorting.
The intended output is:
time code
0 08:00:00 A
1 12:00:00 B
2 13:00:00 C
3 16:00:00 C
4 20:00:00 A
5 26:00:00 B
6 27:00:00 A
I hope this is what you want:
import pandas as pd
d = ({
'time' : ['08:00:00 am','12:00:00 pm','16:00:00 pm','20:00:00 pm','2:00:00 am','13:00:00 pm','3:00:00 am'],
'code' : ['A','B','C','A','B','C','A'],
})
df = pd.DataFrame(data=d)
df['time'] = pd.to_timedelta(df['time'])
cutoff, day = pd.to_timedelta(['3.5H', '24H'])
df.time.apply(lambda x: x if x > cutoff else x + day).sort_values().reset_index(drop=True)
print(df)
x = df.time.apply(lambda x: x if x > cutoff else x + day).sort_values().reset_index(drop=True).dt.components
df['time'] = x.apply(lambda x: '{:02d}:{:02d}:{:02d}'.format(x.days*24+x.hours, x.minutes, x.seconds), axis=1)
print(df)
Removing reset_index(drop=True) from your code and sorting afterwards may work for you:
import pandas as pd
d = ({
'time' : ['08:00:00 am','12:00:00 pm','16:00:00 pm','20:00:00 pm','2:00:00 am','13:00:00 pm','3:00:00 am'],
'code' : ['A','B','C','A','B','C','A'],
})
df = pd.DataFrame(data=d)
df['time'] = pd.to_timedelta(df['time'])
cutoff, day = pd.to_timedelta(['3.5H', '24H'])
x = df.time.apply(lambda x: x if x > cutoff else x + day).dt.components
df['time'] = x.apply(lambda x: '{:02d}:{:02d}:{:02d}'.format(x.days*24+x.hours, x.minutes, x.seconds), axis=1)
df = df.sort_values('time')
print(df)
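As a variant of the same idea, here is a minimal sketch that sorts whole rows on the shifted timedelta rather than on the formatted strings. It makes some assumptions that are not in the original post: the am/pm suffixes are dropped from the sample times (how those strings parse may depend on the pandas version), the cutoff and day are built with pd.Timedelta instead of the '3.5H' and '24H' strings, and the names td, shifted, secs and result are purely illustrative.

```python
import pandas as pd

# Sample data from the question, with the am/pm suffixes dropped (assumption)
df = pd.DataFrame({
    "time": ["08:00:00", "12:00:00", "16:00:00", "20:00:00",
             "02:00:00", "13:00:00", "03:00:00"],
    "code": ["A", "B", "C", "A", "B", "C", "A"],
})

td = pd.to_timedelta(df["time"])
cutoff = pd.Timedelta(hours=3, minutes=30)
day = pd.Timedelta(days=1)

# Push times at or before the cutoff into the next day.
# This is element-wise, so the original index is untouched.
shifted = td.where(td > cutoff, td + day)

# Format as HH:MM:SS, letting the hour field run past 23
secs = shifted.dt.total_seconds().astype(int).tolist()
df["time"] = [f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}" for s in secs]

# Reorder whole rows by the shifted timedelta, then renumber the index
result = df.loc[shifted.sort_values().index].reset_index(drop=True)
print(result)
```

Because the rows are reordered as a unit, each code stays with its own time, and reset_index(drop=True) is only applied at the very end to renumber the finished frame.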
Pandas aligns on the index when a Series is assigned to a column. reset_index(drop=True) discarded the original index, so the sorted time values were assigned back row by row in sorted order instead of to the rows they came from. This is probably why you didn't get what you wanted.
Original time column:
0 08:00:00
1 12:00:00
2 16:00:00
3 20:00:00
4 02:00:00
5 13:00:00
6 03:00:00
After sort_values():
4 02:00:00
6 03:00:00
0 08:00:00
1 12:00:00
5 13:00:00
2 16:00:00
3 20:00:00
After reset_index(drop=True):
0 02:00:00
1 03:00:00
2 08:00:00
3 12:00:00
4 13:00:00
5 16:00:00
6 20:00:00
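The alignment behaviour described above can be reproduced on a tiny frame. This is only an illustrative sketch: the Series s and the column names aligned and positional are made up for the demonstration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame({"code": ["A", "B", "C"]})

# A Series whose index labels are out of order, as they are after sort_values()
s = pd.Series(["z", "x", "y"], index=[2, 0, 1])

# Assignment matches on index labels: label 0 gets "x", 1 gets "y", 2 gets "z"
df["aligned"] = s

# After reset_index(drop=True) the labels become 0, 1, 2 in the Series' own order,
# so the values land in the rows in that order instead
df["positional"] = s.reset_index(drop=True)

print(df)
```

The aligned column follows the original labels, while the positional column simply takes the values in the order they appear in s, which is the mismatch seen in the question.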