简体   繁体   中英

Pandas: Join Date and Time into one datetime column

I have a dataframe with 13 columns, where two columns are "Date" and "Time".

I would like to 1) join "Date" and "Time" and 2) format the result correctly (DD.MM.YYYY HH:MM:SS)

        Date    Time
0   2012-09-27  00:40:20
1   2012-09-27  07:24:37
2   2012-09-27  07:25:08
3   2012-09-27  07:43:48
4   2012-09-27  07:44:27
5   2012-09-27  07:55:16
6   2012-09-27  08:04:19
7   2012-09-27  08:08:21
8   2012-09-27  08:09:08
9   2012-09-27  08:15:30
10  2012-09-27  08:15:35

So I try:

>type(df["Date"])
pandas.core.series.Series

And I think to myself "Why not convert them both in to string representations and manipulate them that way?"

>g = df["Time"].astype(str)
>type(g)
pandas.core.series.Series

"What?! It's still a Series?!"

So I try to be creative:

>f = df["Date"].astype(str) + " " + df["Time"].astype(str)
>f
0     2012-09-27 00:40:20
1     2012-09-27 07:24:37
2     2012-09-27 07:25:08
3     2012-09-27 07:43:48

"Hey! That works! No let's just get the format right (DD.MM.YYYY HH:MM:SS)"

>#dt.strptime(f, '%Y-%m-%d %H:%M:%S')
TypeError: strptime() argument 1 must be str, not Series

"Still a Series...hmmm....Obviously it won't let itself be converted. Let's try something else"

>f = p.concat([df["Date"], df["Time"]], axis=1)
>f
        Date    Time
0   2012-09-27  00:40:20
1   2012-09-27  07:24:37
2   2012-09-27  07:25:08

"Nice! But.... Isn't this TWO columns still...?"

>f.shape
(100,2)

"Right...back to square one...."

I'm out of ideas....

Anyone?

the columns can be joined and assigned to a new or existing column in the dataframe:

df['datetime'] = df['dates'] + " " + df['time']

outputs example:

        dates      time             datetime
0  2012-09-27  00:40:20  2012-09-27 00:40:20
1  2012-09-27  07:24:37  2012-09-27 07:24:37
2  2012-09-27  07:25:08  2012-09-27 07:25:08

I think you need join columns with to_datetime :

print (type(df.loc[0, 'Date']))
<class 'str'>
print (type(df.loc[0, 'Time']))
<class 'str'>

df['datetime'] = pd.to_datetime(df['Date'] + " " + df['Time'])

If there is datetime with string s:

print (type(df.loc[0, 'Date']))
<class 'pandas._libs.tslib.Timestamp'>
print (type(df.loc[0, 'Time']))
<class 'str'>

df['datetime'] = df['Date']  + pd.to_timedelta(df['Time'])

If there is datetime with time s:

print (type(df.loc[0, 'Date']))
<class 'pandas._libs.tslib.Timestamp'>

print (type(df.loc[0, 'Time']))
<class 'datetime.time'>

df['datetime'] = df['Date']  + pd.to_timedelta(df['Time'].astype(str))

And if there are dates with times:

print (type(df.loc[0, 'Date']))
<class 'datetime.date'>

print (type(df.loc[0, 'Time']))
<class 'datetime.time'>

df['datetime'] = pd.to_datetime(df["Date"].astype(str) + " " + df["Time"].astype(str))

print (df)
          Date      Time            datetime
0   2012-09-27  00:40:20 2012-09-27 00:40:20
1   2012-09-27  07:24:37 2012-09-27 07:24:37
2   2012-09-27  07:25:08 2012-09-27 07:25:08
3   2012-09-27  07:43:48 2012-09-27 07:43:48
4   2012-09-27  07:44:27 2012-09-27 07:44:27
5   2012-09-27  07:55:16 2012-09-27 07:55:16
6   2012-09-27  08:04:19 2012-09-27 08:04:19
7   2012-09-27  08:08:21 2012-09-27 08:08:21
8   2012-09-27  08:09:08 2012-09-27 08:09:08
9   2012-09-27  08:15:30 2012-09-27 08:15:30
10  2012-09-27  08:15:35 2012-09-27 08:15:35

.astype(str) works on the elements of Series, not the Series itself, so of course type(df["Time"].astype(str)) == pd.Series . This seems to be the source of much of your confusion, you're acting on the Series, not its elements.

A solution (there maybe an easier way) is to just loop over the series:

dts = [datetime.datetime.strptime(elem, '%Y-%m-%d%H:%M:%S') 
       for elem in df['Date'] + df['Time']]

fmted = [elem.strftime('%d-%m-%Y %H:%M:%S') for elem in dts]

df.insert(0, 'DateTime', fmted)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM