Pandas datetime column problem and I don't know what I am missing

I am a Korean student.

Please excuse my awkward English.

I want to split the datetime column into separate columns: year, month, ..., second.

import pandas as pd

train = pd.read_csv('input/Train.csv')

The datetime column looks like this (this is head(20), with the other columns removed so it is easier to see):

    datetime

0   2011-01-01 00:00:00
1   2011-01-01 01:00:00 
2   2011-01-01 02:00:00
3   2011-01-01 03:00:00
4   2011-01-01 04:00:00
5   2011-01-01 05:00:00
6   2011-01-01 06:00:00 
7   2011-01-01 07:00:00
8   2011-01-01 08:00:00
9   2011-01-01 09:00:00
10  2011-01-01 10:00:00
11  2011-01-01 11:00:00
12  2011-01-01 12:00:00
13  2011-01-01 13:00:00
14  2011-01-01 14:00:00
15  2011-01-01 15:00:00
16  2011-01-01 16:00:00
17  2011-01-01 17:00:00
18  2011-01-01 18:00:00
19  2011-01-01 19:00:00

Then I wrote this code to get each component as its own column (year, month, day, hour, minute, second):

train['year'] = train['datetime'].dt.year

train['month'] = train['datetime'].dt.month

train['day'] = train['datetime'].dt.day

train['hour'] = train['datetime'].dt.hour

train['minute'] = train['datetime'].dt.minute

train['second'] = train['datetime'].dt.second

and I get this error:

AttributeError: Can only use .dt accessor with datetimelike values

please help me ㅠㅅㅠ

Note that by default read_csv infers the column type only for numeric and boolean columns. Unless you specify otherwise (e.g. by passing the parse_dates, converters or dtype parameters), everything else is left as strings, and the pandas type of such a column is object.
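A small sketch of that behaviour, assuming a two-row inline CSV in place of input/Train.csv (the count column and its values are made up for illustration). It prints the inferred dtypes with and without parse_dates:

```python
import io
import pandas as pd

# Hypothetical sample standing in for input/Train.csv
csv_text = "datetime,count\n2011-01-01 00:00:00,16\n2011-01-01 01:00:00,40\n"

# Default call: the numeric column is inferred, the timestamps stay strings
plain = pd.read_csv(io.StringIO(csv_text))
print(plain.dtypes)

# parse_dates asks read_csv to convert the named column while reading
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"])
print(parsed.dtypes)
```

Checking train.dtypes right after read_csv is usually the quickest way to see which case you are in.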

That is exactly what happened in your case. The datetime column was read as object, so you cannot use the dt accessor on it, because dt works only on columns of datetime type.
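To illustrate, here is a minimal sketch using a hand-built stand-in for train (two made-up rows, not the real file). It reproduces the error on the string column and then shows that the dt accessor works once the column has been converted with pd.to_datetime:

```python
import pandas as pd

# Hypothetical stand-in for the question's `train` DataFrame
train = pd.DataFrame({"datetime": ["2011-01-01 00:00:00", "2011-01-01 01:00:00"]})

# On a string column the accessor is rejected
try:
    train["datetime"].dt.year
except AttributeError as exc:
    print("before conversion:", exc)

# Convert the strings to real timestamps, then .dt is available
train["datetime"] = pd.to_datetime(train["datetime"])
train["year"] = train["datetime"].dt.year
train["hour"] = train["datetime"].dt.hour
print(train)
```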

Actually, in this case, you can take the following approach:

  • do not specify any conversion for this column (it will be read as object),
  • then split the datetime column into its parts using str.split (all 6 columns in a single instruction),
  • set proper column names in the resulting DataFrame,
  • join it to the original DataFrame (then drop the temporary one),
  • only at the end change the type of the original column.

To do it, you can run:

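# df is your DataFrame (train in the question)
# split each string on '-', ' ' and ':' into 6 columns and convert them to int
wrk = df['datetime'].str.split(r'[- :]', expand=True).astype(int)
wrk.columns = ['year', 'month', 'day', 'hour', 'minute', 'second']
# attach the new columns to the original DataFrame (aligned on the index)
df = df.join(wrk)
del wrk
# finally convert the original column from string to datetime
df['datetime'] = pd.to_datetime(df['datetime'])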

Note that I added astype(int). Otherwise these columns would be left as object (actually string) type.
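A self-contained sketch of the same idea, assuming a two-row sample instead of the real file (the timestamps are made up). It keeps the intermediate result so the effect of astype(int) is visible, and compares the outcome with what the dt accessor gives after conversion:

```python
import pandas as pd

# Hypothetical sample; the real data would come from read_csv
df = pd.DataFrame({"datetime": ["2011-01-01 00:00:00", "2011-01-01 19:30:05"]})

# Splitting alone yields string pieces such as '2011' and '01'
parts = df["datetime"].str.split(r"[- :]", expand=True)
print(parts.dtypes)

# astype(int) turns the pieces into numbers
as_int = parts.astype(int)
as_int.columns = ["year", "month", "day", "hour", "minute", "second"]
df = df.join(as_int)

# Cross-check against the .dt accessor on the converted column
stamps = pd.to_datetime(df["datetime"])
same_hours = (df["hour"] == stamps.dt.hour).all()
print(df)
print("hours match .dt.hour:", same_hours)
```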

Or maybe the original column is not needed any more (as you have extracted all the date / time components)? In that case, drop the column instead of converting it.

And one last hint: datetime is mostly used as a type name (with various suffixes, e.g. datetime64). So it is better to use a different name for the column, at least one that differs in letter case, e.g. DateTime.

