I'm using Python 2.7 and my data looks like this:
import pandas as pd
df = pd.DataFrame({ 'DateVar' : ['9/1/2013', '10/1/2013', '2/1/2014'],
'Field' : 'foo' })
I want to parse DateVar to create 2 new fields: a 'month' field and a 'year' field.
I was able to tokenize 'DateVar' using the vectorized string method:
df.DateVar.str.split('/')
This is a little closer to what I want, so I then tried to slice out the months [9, 10, 2] with the following code:
df.DateVar.str.split('/')[0]
But unexpectedly, I'm getting:
['9', '1', '2013']
So how can I get a vector of all the months?
If you only need one column, you can use:
df.DateVar.str.split("/").str[0]
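Since the question asks for both a month and a year field, the same .str[...] indexing can be applied twice. This is only a minimal sketch: the column names 'month' and 'year' and the intermediate variable parts are chosen here for illustration, and it assumes every value has the form month/day/year.

```python
import pandas as pd

df = pd.DataFrame({'DateVar': ['9/1/2013', '10/1/2013', '2/1/2014'],
                   'Field': 'foo'})

# Split once; the result is a Series whose elements are lists of strings.
parts = df.DateVar.str.split('/')

# .str[i] indexes into each list element-wise, unlike a plain [i],
# which would select a row of the Series by label.
df['month'] = parts.str[0].astype(int)
df['year'] = parts.str[2].astype(int)
```

After this, df has the original two columns plus integer 'month' and 'year' columns.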
If you need both the month and day columns, use str.extract:
import pandas as pd
df = pd.DataFrame({ 'DateVar' : ['9/1/2013', '10/1/2013', '2/1/2014'],
'Field' : 'foo' })
print df.DateVar.str.extract(r"(?P<month>\d+)/(?P<day>\d+)/\d+").astype(int)
The output:
   month  day
0      9    1
1     10    1
2      2    1
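The question asks for month and year rather than month and day, so the pattern can be adapted by moving the second named group. This is a sketch under the assumption that the values are always month/day/year; the group names month and year are picked here to match the desired column names.

```python
import pandas as pd

df = pd.DataFrame({'DateVar': ['9/1/2013', '10/1/2013', '2/1/2014'],
                   'Field': 'foo'})

# Named groups become column names; the day is matched but not captured.
month_year = df.DateVar.str.extract(r"(?P<month>\d+)/\d+/(?P<year>\d+)").astype(int)

# Attach the new columns to the original frame (aligned on the index).
df = df.join(month_year)
```

With two capture groups, str.extract returns a DataFrame, so join can attach both columns in one step.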
This happens because str.split returns a Series in which each element is a list:
>>> df.DateVar.str.split('/')
0 [9, 1, 2013]
1 [10, 1, 2013]
2 [2, 1, 2014]
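so indexing with [0] selects the row with label 0 (the whole first list), not the first element of each list: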
>>> df.DateVar.str.split('/')[0]
['9', '1', '2013']
To get the months from every row, take the first element of each list with a list comprehension:
v = [x[0] for x in df.DateVar.str.split('/')]
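A different technique from the string splitting shown in this thread, and possibly simpler if the column really holds dates, is to parse it with pd.to_datetime and read the parts through the .dt accessor. This is only a sketch: the format string '%m/%d/%Y' is an assumption that the values are month/day/year, and the .dt accessor needs a reasonably recent pandas.

```python
import pandas as pd

df = pd.DataFrame({'DateVar': ['9/1/2013', '10/1/2013', '2/1/2014'],
                   'Field': 'foo'})

# Parse the strings into datetimes; the format is an assumption (month/day/year).
dates = pd.to_datetime(df['DateVar'], format='%m/%d/%Y')

# .dt exposes the datetime components as integer Series.
df['month'] = dates.dt.month
df['year'] = dates.dt.year
```

This yields integers directly, so no astype(int) conversion is needed.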