
Parsing Pandas df Column of mixed data into Datetime

df = pd.DataFrame('23.Jan.2020 01.Mar.2017 5663:33 20.May.2021 626'.split())

Given this df, I want to convert the date-like elements to datetime and, for the numbers, return the original value.

I have tried

t=pd.to_datetime(df[0], format='%d.%b.%Y', errors='ignore')

which just returns the original df with no change. I have also tried changing errors to 'coerce', which converts the date-like elements, but the numbers are dropped (they become NaT):

t=pd.to_datetime(df[0], format='%d.%b.%Y', errors='coerce')
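The effect of errors='coerce' can be checked directly. The following is a minimal sketch using the same data as the question; the names s and unparsed are introduced here for illustration only. It shows that entries not matching the format come back as NaT, and that the NaT mask can be used to look up the original strings.

```python
import pandas as pd

# Same sample data as in the question, held in a Series for convenience
s = pd.Series('23.Jan.2020 01.Mar.2017 5663:33 20.May.2021 626'.split())

# Entries that do not match the format are turned into NaT
t = pd.to_datetime(s, format='%d.%b.%Y', errors='coerce')

# The NaT mask identifies the original values that failed to parse
unparsed = s[t.isna()]
print(t)
print(unparsed.tolist())  # ['5663:33', '626']
```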

Then I tried to keep the original df value where t is NaT, and otherwise substitute the new datetime from t:

df.where(t.isnull(), other=t, axis=1)

This works for returning the original df value where t is NaT, but it doesn't transfer the datetime values.

This will combine the two field types in the way you have specified:

import pandas as pd
df = pd.DataFrame('23.Jan.2020 01.Mar.2017 5663:33 20.May.2021 626'.split())
mod = pd.to_datetime(df[0], format='%d.%b.%Y', errors='coerce')

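# Put the original strings and the parsed datetimes side by side
ndf = pd.concat([df, mod], axis=1)
ndf.columns = ['original', 'modified']

def funk(col1, col2):
    # Keep the original value where parsing failed (NaT), else use the datetime
    return col1 if pd.isnull(col2) else col2

ndf.apply(lambda x: funk(x.original, x.modified), axis=1)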

# 0    2020-01-23 00:00:00
# 1    2017-03-01 00:00:00
# 2                5663:33
# 3    2021-05-20 00:00:00
# 4                    626

Maybe this is what you want?

dt = pd.Series('23.Jan.2020 01.Mar.2017 5663:33 20.May.2021 626'.split())
res = pd.to_datetime(dt, format="%d.%b.%Y", errors='coerce').fillna(dt)

This way the resulting elements in the series have the correct types:

>>> res.map(type)
0    <class 'pandas._libs.tslibs.timestamps.Timesta...
1    <class 'pandas._libs.tslibs.timestamps.Timesta...
2                                        <class 'str'>
3    <class 'pandas._libs.tslibs.timestamps.Timesta...
4                                        <class 'str'>
dtype: object

PS: I used a Series because it's easier to pass to to_datetime and to Series.fillna.
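Another option, not taken from either answer above, is to parse each element on its own and fall back to the original string when parsing fails. This is only a sketch: parse_or_keep is a hypothetical helper name, and it assumes an English locale, since the %b month abbreviation in datetime.strptime is locale-dependent.

```python
from datetime import datetime

import pandas as pd


def parse_or_keep(value, fmt='%d.%b.%Y'):
    """Return a Timestamp if value matches fmt, else the value unchanged.

    Hypothetical helper for illustration; assumes an English locale for %b.
    """
    try:
        return pd.Timestamp(datetime.strptime(value, fmt))
    except ValueError:
        # Not date-like in this format: keep the original string
        return value


dt = pd.Series('23.Jan.2020 01.Mar.2017 5663:33 20.May.2021 626'.split())

# Element-wise conversion; the result is an object Series of mixed types
res = dt.map(parse_or_keep)
print(res.map(type))
```

This is slower than the vectorized to_datetime call on large columns, but it keeps the fallback logic in one place.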
