Sorry if this is a basic question; I just started using the pandas module today. I'm using it to clean up a CSV file I'm working with. I want to search for a specific string (or substring) in the second column and add a new column to the dataset containing a boolean True/False value that indicates whether the string was found. Suggestions?
You can use the Series.str.contains() method (import re first so that the flags argument is available):
df['new'] = df.iloc[:, 1].str.contains(r'substring', flags=re.I)
Demo:
In [40]: import re
In [41]: df
Out[41]:
   a       b   c
0  1    Anna  10
1  2  Barton  11
2  3     Max  12
In [42]: df['new'] = df.iloc[:, 1].str.contains(r'ma', flags=re.I)
In [43]: df
Out[43]:
   a       b   c    new
0  1    Anna  10  False
1  2  Barton  11  False
2  3     Max  12   True
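Since you are cleaning up a CSV, two extra cases may come up: missing values in the column, and search strings that contain regex metacharacters. The sketch below is only an illustration, assuming the same small frame as in the demo plus one hypothetical row with a missing value in column b. It uses the case, na and regex parameters of Series.str.contains(); the column names new and literal are made up for the example.

```python
import numpy as np
import pandas as pd

# Same data as the demo, plus a hypothetical fourth row with a missing value.
df = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": ["Anna", "Barton", "Max", np.nan],
    "c": [10, 11, 12, 13],
})

# case=False ignores case (no need to import re for flags=re.I);
# na=False makes rows with missing values come out as False instead of NaN.
df["new"] = df.iloc[:, 1].str.contains("ma", case=False, na=False)

# regex=False treats the pattern as a literal substring, so "." is a plain dot
# rather than "any character".
df["literal"] = df.iloc[:, 1].str.contains("a.", regex=False, na=False)

print(df)
```

Without na=False, the row with the missing value would hold NaN in the new column, which can get in the way if you later use that column as a boolean mask.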