I am trying to split the Time column from my dataset. The Time column has values like '2324' instead of '23:24'. I used the command df['MINUTES']=df['MINUTES'].str[1:3], but it did not work correctly: '2324' shows up as '23:32', which is incorrect. The time column is based on the 24-hour clock. How do I split the values properly? Please be gentle, I am just starting out in the Python/DA field.
Thanks in advance! Anil
I am not sure where the issue arose, since having 24-hour times shouldn't affect the script. Here's an example that seems to match the expected output:
import pandas as pd
df = pd.DataFrame({'Example':['1242','1342','1532','1643','1758','1821','1902','0004','2324']})
df['Hour'] = df['Example'].str[:2]
df['Minute'] = df['Example'].str[2:]
df['Time'] = df['Example'].str[:2] + ":" + df['Example'].str[2:]
This generates the following output:
  Example Hour Minute   Time
0    1242   12     42  12:42
1    1342   13     42  13:42
2    1532   15     32  15:32
3    1643   16     43  16:43
4    1758   17     58  17:58
5    1821   18     21  18:21
6    1902   19     02  19:02
7    0004   00     04  00:04
8    2324   23     24  23:24
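One possible explanation for the '23:32' in the question is the slice itself rather than the 24-hour format: Python slices start counting at 0, so str[1:3] picks the second and third characters. The sketch below assumes the column holds four-character strings, as in the question; the variable names wrong, hours and minutes are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'MINUTES': ['2324', '0004']})

# str[1:3] takes the characters at positions 1 and 2, so '2324' gives '32'
wrong = df['MINUTES'].str[1:3]

# str[:2] takes positions 0 and 1 (the hour),
# str[2:4] takes positions 2 and 3 (the minute), so '2324' gives '23' and '24'
hours = df['MINUTES'].str[:2]
minutes = df['MINUTES'].str[2:4]

print(wrong.tolist())    # ['32', '00']
print(minutes.tolist())  # ['24', '04']
```

If the column were stored as integers instead of strings (an assumption, the question does not say), leading zeros would be lost, and something like df['MINUTES'].astype(str).str.zfill(4) would be needed before slicing.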
Here is what you can do:
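df['MINUTES'] = df['MINUTES'].replace(r'(?<=\d\d)(?=\d\d)', ':', regex=True)
We are basically telling Python to insert a colon ':' in this gap: '(?<=\d\d)(?=\d\d)', that is, at the position that has two digits on each side. The pattern is written as a raw string (r'...') so that \d is passed to the regex engine unchanged, and the result is assigned back to the column instead of relying on inplace=True, which may not update the DataFrame in newer pandas versions.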
Let's test it:
import pandas as pd
df = pd.DataFrame({'MINUTES':['1234',
'7654',
'8766']})
# Insert ':' at the position with two digits on each side
df['MINUTES'] = df['MINUTES'].replace(r'(?<=\d\d)(?=\d\d)', ':', regex=True)
print(df)
Output:
  MINUTES
0   12:34
1   76:54
2   87:66
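Note that the regex only inserts a colon; it does not check that the result is a real time, which is why '76:54' and '87:66' appear in the output above. If validation matters, one option is to parse the strings with pd.to_datetime. This is a minimal sketch, assuming four-digit strings in 24-hour HHMM form; the TIME column name is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'MINUTES': ['1234', '7654', '8766', '0004']})

# Parse as hour and minute; values that are not valid times become NaT
parsed = pd.to_datetime(df['MINUTES'], format='%H%M', errors='coerce')

# Format the valid times as 'HH:MM'; invalid rows stay missing
df['TIME'] = parsed.dt.strftime('%H:%M')

print(df)
```

Here '1234' and '0004' become '12:34' and '00:04', while '7654' and '8766' are left as missing values rather than being turned into impossible times.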