pandas datetime column problem: I don't know what I am missing

Question

I am a Korean student, so please excuse my awkward English.

I want to split the datetime column into separate year, month, day, hour, minute, and second columns.

import pandas as pd
train = pd.read_csv('input/Train.csv')

The datetime column looks like this (this is head(20); I removed the other columns to make it easier to read):

    datetime

0   2011-01-01 00:00:00
1   2011-01-01 01:00:00 
2   2011-01-01 02:00:00
3   2011-01-01 03:00:00
4   2011-01-01 04:00:00
5   2011-01-01 05:00:00
6   2011-01-01 06:00:00 
7   2011-01-01 07:00:00
8   2011-01-01 08:00:00
9   2011-01-01 09:00:00
10  2011-01-01 10:00:00
11  2011-01-01 11:00:00
12  2011-01-01 12:00:00
13  2011-01-01 13:00:00
14  2011-01-01 14:00:00
15  2011-01-01 15:00:00
16  2011-01-01 16:00:00
17  2011-01-01 17:00:00
18  2011-01-01 18:00:00
19  2011-01-01 19:00:00

Then I wrote this code to extract each component (year, month, day, hour, minute, second):

train['year'] = train['datetime'].dt.year

train['month'] = train['datetime'].dt.month

train['day'] = train['datetime'].dt.day

train['hour'] = train['datetime'].dt.hour

train['minute'] = train['datetime'].dt.minute

train['second'] = train['datetime'].dt.second

and I get this error:

AttributeError: Can only use .dt accessor with datetimelike values

please help me ㅠㅅㅠ

Answer 1

Note that by default read_csv infers the column type only for numeric and boolean columns. Unless you specify otherwise (e.g. by passing the parse_dates, converters, or dtype parameters), all other input is left as strings, and the pandas dtype of such columns is object.
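To illustrate the point about read_csv, here is a minimal sketch that reads the same text twice, once with the defaults and once with the parse_dates parameter. The inline CSV text is a hypothetical stand-in for input/Train.csv, which is not available here, and contains only the datetime column.

```python
import io

import pandas as pd

# Hypothetical stand-in for input/Train.csv (only the datetime column).
csv_text = "datetime\n2011-01-01 00:00:00\n2011-01-01 01:00:00\n"

# Default behaviour: the column is read as plain strings.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw["datetime"].dtype)

# With parse_dates, read_csv converts the column while loading.
parsed = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"])
print(parsed["datetime"].dtype)
```

Only the second DataFrame has a datetime column, so only there would the dt accessor work.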

That is what happened in your case. Because the column has object dtype, you cannot use the dt accessor on it; it works only on columns of datetime type.
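The most direct fix for the code in the question is probably to convert the column first and then use the dt accessor unchanged. The sketch below assumes a small hand-made DataFrame in place of the real Train.csv data; the loop over attribute names is just a shorter way of writing the six assignments from the question.

```python
import pandas as pd

# Hypothetical sample standing in for the real data: strings, as read_csv returns them.
train = pd.DataFrame(
    {"datetime": ["2011-01-01 00:00:00", "2011-01-01 01:00:00"]}
)

# Convert the string column to a datetime column.
train["datetime"] = pd.to_datetime(train["datetime"])

# Now the dt accessor is available for every component.
for part in ["year", "month", "day", "hour", "minute", "second"]:
    train[part] = getattr(train["datetime"].dt, part)

print(train)
```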

Actually, in this case, you can take the following approach:

do not convert this column when reading the file (it will be parsed as object),
then split the datetime column into its parts using str.split (all 6 columns with a single instruction),
set proper column names in the resulting DataFrame,
join it to the original DataFrame (then drop the temporary DataFrame),
only then change the type of the original column.

To do it, you can run:

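# Split on '-', ' ' and ':' to get 6 columns, then convert the strings to integers
wrk = df['datetime'].str.split(r'[- :]', expand=True).astype(int)
wrk.columns = ['year', 'month', 'day', 'hour', 'minute', 'second']
# Append the new columns to the original DataFrame and drop the temporary one
df = df.join(wrk)
del wrk
# Finally convert the original column from strings to datetime
df['datetime'] = pd.to_datetime(df['datetime'])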

Note that I added astype(int). Otherwise these columns would be left with object dtype (actually strings).
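The effect of astype(int) can be checked on a tiny example. This is only a sketch on a hypothetical two-row DataFrame; the names as_text and as_int are introduced here for illustration and are not part of the code above.

```python
import pandas as pd

# Hypothetical two-row sample with the same format as the question's data.
df = pd.DataFrame(
    {"datetime": ["2011-01-01 00:00:00", "2011-01-01 13:00:00"]}
)

# Without astype(int), every part is still a string such as '13'.
as_text = df["datetime"].str.split(r"[- :]", expand=True)
print(type(as_text.iloc[1, 3]))

# With astype(int), the parts become integers that can be compared or summed.
as_int = as_text.astype(int)
as_int.columns = ["year", "month", "day", "hour", "minute", "second"]
print(as_int.dtypes)
```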

Or perhaps the original column is no longer needed (as you have extracted all date and time components)? In that case, drop the column instead of converting it.

One last hint: datetime is normally used as a type name (with various suffixes, such as datetime64). So it is better to use some other name for the column, at least one differing in letter case, e.g. DateTime.

solution1
0 2020-06-03 03:57:46
