
Python pandas: using a regex to extract a substring from an object column

I created a dataframe in Python from a CSV file using the pandas module. By default, pandas stored the string columns with the object dtype. I now want to create another column from one of those string columns using a regex. However, because the column is of type object, I am getting an error:

data = pd.read_csv(r'Desktop\train.csv')
desig = re.search(r'(\w+), (\w+). (\w+)',data['Name']).group(1)

TypeError: expected string or buffer

How can I extract the portion from the object?

Thanks.

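You want to use the vectorised string operations available through the .str accessor of the column (a pandas Series). re.search expects a single string rather than a whole column, which is why it raises the TypeError:

data['desig'] = data['Name'].str.extract(r'(\w+), (\w+)\. (\w+)')[0]

Note that str.extract with three capture groups returns a dataframe with three columns, one per group, so you have to select the one you want before assigning it to a single column. Column 0 corresponds to group(1) in your original code. The dot is escaped here so that it matches a literal period rather than any character.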
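
To see the three resulting columns and how one of them can be picked out, here is a minimal sketch. It assumes the names follow a "Surname, Title. Given" layout; the sample rows and the group names surname, title and given are made up for illustration and are not taken from train.csv.

```python
import pandas as pd

# Hypothetical sample rows standing in for the Name column of train.csv
data = pd.DataFrame({"Name": ["Braund, Mr. Owen Harris",
                              "Heikkinen, Miss. Laina",
                              "no match here"]})

# Named groups become the column names of the result;
# the dot is escaped so it matches a literal period
pattern = r"(?P<surname>\w+), (?P<title>\w+)\. (?P<given>\w+)"
parts = data["Name"].str.extract(pattern)

# Assign a single group to the new column; rows that do not match become NaN
data["desig"] = parts["surname"]
print(data)
```

Naming the groups is optional, but it lets you select a column by name instead of by position, which can be easier to read than a numeric index.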
