Using str.split for pandas dataframe values based on parentheses location

Question

Let's say I have the following column, df['Name'], in a dataframe:

         Name
       'Jerry'
  'Adam (and family)'
'Paul and Hellen (and family):\n'
'John and Peter (and family):/n'

How would I remove everything in Name from the first opening parenthesis onward?

df['Name']= df['Name'].str.split("'(").str[0]

doesn't seem to work, and I don't understand why.

The output I want is

         Name
       'Jerry'
        'Adam'
    'Paul and Hellen'
    'John and Peter'

so everything from the opening parenthesis onward is deleted.

Answer 1

Solution with split: the pattern is treated as a regular expression, so the ( must be escaped with a backslash (\( in a raw string):

df['Name'] = df['Name'].str.split(r"\s+\(").str[0]
print (df)
               Name
0           'Jerry'
1             'Adam
2  'Paul and Hellen
3   'John and Peter
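
As an alternative to escaping, the split can be done on a literal "(". This is only a sketch: it assumes pandas 1.4 or later (where str.split accepts regex=), and the sample values below leave out the surrounding quotes on the assumption that they are not part of the real strings.

```python
import pandas as pd

# Sample values modelled on the question; the surrounding quotes are omitted
# on the assumption that they are not part of the actual strings.
df = pd.DataFrame({"Name": [
    "Jerry",
    "Adam (and family)",
    "Paul and Hellen (and family):\n",
    "John and Peter (and family):/n",
]})

# regex=False treats "(" as a literal character, so no escaping is needed.
# n=1 stops after the first "(", and rstrip() drops the space left before it.
literal_result = (
    df["Name"]
    .str.split("(", n=1, regex=False)
    .str[0]
    .str.rstrip()
)
print(literal_result.tolist())
```

Rows without a parenthesis, such as Jerry, come back unchanged because the split yields a single-element list.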

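Solution with regex and replace (recent pandas versions need regex=True, because str.replace treats the pattern as a literal string by default):

df['Name'] = df['Name'].str.replace(r"\s+\(.*$", "", regex=True)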
print (df)
               Name
0           'Jerry'
1             'Adam
2  'Paul and Hellen
3   'John and Peter

The pattern \s+\(.*$ matches one or more whitespace characters, then the first (, then everything up to the end of the string ($). The whole match is replaced with "", an empty string.
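
The two regex approaches can be compared side by side in a self-contained script. This is a sketch under some assumptions: pandas 1.4 or later, sample values without the surrounding quotes, and a re.DOTALL flag that is an addition here (not part of the answer above) so that . also matches a trailing newline.

```python
import re

import pandas as pd

# Sample values modelled on the question (quotes omitted as an assumption).
names = pd.Series([
    "Jerry",
    "Adam (and family)",
    "Paul and Hellen (and family):\n",
    "John and Peter (and family):/n",
], name="Name")

# split: cut at "whitespace followed by (" and keep the part before it.
split_result = names.str.split(r"\s+\(", regex=True).str[0]

# replace: delete from "whitespace followed by (" to the end of the string.
# re.DOTALL is an assumption added here so a trailing newline is removed too.
replace_result = names.str.replace(
    r"\s+\(.*$", "", regex=True, flags=re.DOTALL
)

print(split_result.tolist())
print(replace_result.tolist())
```

Both calls leave values without a parenthesis untouched, since the pattern simply does not match them.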

Answer 2

Use a regular expression:

>>> import re
>>> name = 'Adam (and family)'
>>> result = re.sub(r" \(.*$", '', name)
>>> print(result)
Adam
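
The same idea can be wrapped in a small helper and applied to several values. The function name strip_parenthetical is hypothetical, and the pattern is slightly loosened (\s* instead of a single space, plus re.DOTALL) as an assumption so that it also handles missing spaces and trailing newlines.

```python
import re

# Hypothetical helper, not part of any library. The pattern matches optional
# whitespace, the first "(", and everything after it (including newlines).
_PAREN_TAIL = re.compile(r"\s*\(.*", re.DOTALL)


def strip_parenthetical(name: str) -> str:
    """Return name with everything from the first "(" onward removed."""
    return _PAREN_TAIL.sub("", name)


samples = [
    "Jerry",
    "Adam (and family)",
    "Paul and Hellen (and family):\n",
    "John and Peter (and family):/n",
]
cleaned = [strip_parenthetical(s) for s in samples]
print(cleaned)
```

On a dataframe, a function like this could be passed to df['Name'].map, although the vectorized .str methods are usually the more idiomatic choice.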

2 answers

solution1
2 ACCEPTED 2017-02-13 14:04:22

solution2
0 2017-02-13 13:54:57
