Let's say I have a DataFrame with the following df['Name'] column:
Name
'Jerry'
'Adam (and family)'
'Paul and Hellen (and family):\n'
'John and Peter (and family):/n'
How would I remove everything in Name from the first opening parenthesis onward?
df['Name']= df['Name'].str.split("'(").str[0]
doesn't seem to work, and I don't understand why.
The output I want is
Name
'Jerry'
'Adam'
'Paul and Hellen'
'John and Peter'
so everything from the opening parenthesis onward is deleted.
Solution with split: it is necessary to escape ( with a backslash (\), because ( is a special character in a regular expression:
df['Name'] = df['Name'].str.split(r"\s+\(").str[0]
print (df)
Name
0 'Jerry'
1 'Adam
2 'Paul and Hellen
3 'John and Peter
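To illustrate why the escape matters, here is a minimal sketch using only the standard re module. It assumes the pattern is interpreted as a regular expression, as the escaping advice above implies. The variable names are illustrative only.

```python
import re

# An unescaped "(" opens a group, so a pattern containing a lone "("
# is not a valid regular expression and fails to compile.
try:
    re.compile("(")
    unescaped_compiles = True
except re.error:
    unescaped_compiles = False

# Escaped with a backslash, "(" matches a literal parenthesis.
# \s+ also consumes the whitespace in front of it.
pattern = re.compile(r"\s+\(")
parts = pattern.split("Paul and Hellen (and family):\n")
first_part = parts[0]

print(unescaped_compiles)
print(first_part)
```

Writing the pattern as a raw string (r"...") keeps Python from treating the backslashes as string escapes.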
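Solution with regex and replace (pass regex=True, because recent pandas versions treat the pattern as a literal string by default):
df['Name'] = df['Name'].str.replace(r"\s+\(.*$", "", regex=True)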
print (df)
Name
0 'Jerry'
1 'Adam
2 'Paul and Hellen
3 'John and Peter
\s+\(.*$ means: replace everything from one or more whitespace characters followed by the first ( up to the end of the string ($) with "", an empty string.
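Both approaches can be tried end to end with a sketch like the following. It assumes the names contain no literal quote characters and that the "\n" in the third name is a real newline; both are guesses about the original data. The re.DOTALL flag is an addition here, not part of the answer above: without it, "." stops at a newline, so a trailing newline would be left behind by replace.

```python
import re

import pandas as pd

# Sample data modelled on the question (assumed, see the note above).
df = pd.DataFrame({
    "Name": [
        "Jerry",
        "Adam (and family)",
        "Paul and Hellen (and family):\n",
        "John and Peter (and family):/n",
    ]
})

# Approach 1: split on "whitespace + literal (" and keep the first piece.
cleaned_split = df["Name"].str.split(r"\s+\(").str[0]

# Approach 2: delete everything from "whitespace + literal (" to the end.
# regex=True is required on pandas versions where the default is a literal match.
cleaned_replace = df["Name"].str.replace(
    r"\s+\(.*$", "", regex=True, flags=re.DOTALL
)

print(cleaned_split.tolist())
print(cleaned_replace.tolist())
```

Rows without a parenthesis, such as "Jerry", pass through both approaches unchanged.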
Use a regular expression:
>>> import re
>>> name = 'Adam (and family)'  # avoid shadowing the built-in str
>>> result = re.sub(r"( \().*$", '', name)
>>> print(result)
Adam