(Python, Pandas) - How do I get everything to the left of a certain character?

Question

I have a column, market_area, that I want to abbreviate by keeping only the part of the string to the left of the first hyphen.

For example, my data is like this:

import pandas as pd
tmp = pd.DataFrame({'market_area': ['San Francisco-Oakland-San Jose',
                                    None, 
                                    'Dallas-Fort Worth', 
                                    'Los Angeles-Riverside-Orange County'],
                    'val': [1,2,3,4]})

My desired output would be:

['San Francisco', None, 'Dallas', 'Los Angeles']

I am able to split based on the hyphen:

tmp['market_area'].str.split('-')

But how do I extract only the part to the left of the hyphen?

Answer 1

You can extract the first element of each split list using .str[0]:

tmp.market_area.str.split('-').str[0]
Out[3]:
0    San Francisco
1             None
2           Dallas
3      Los Angeles
Name: market_area, dtype: object
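
As a self-contained sketch of the same idea, the snippet below rebuilds the question's frame and stores the result in a new column. The column name market_short is made up for illustration, and passing n=1 is an optional tweak (not part of the original answer) that stops splitting after the first hyphen, since only the first piece is needed:

```python
import pandas as pd

tmp = pd.DataFrame({'market_area': ['San Francisco-Oakland-San Jose',
                                    None,
                                    'Dallas-Fort Worth',
                                    'Los Angeles-Riverside-Orange County'],
                    'val': [1, 2, 3, 4]})

# n=1 limits the split to the first hyphen, so each list has at most two pieces.
# 'market_short' is a hypothetical column name chosen for this example.
tmp['market_short'] = tmp['market_area'].str.split('-', n=1).str[0]

# The missing value in row 1 stays missing (None or NaN depending on the pandas version).
short_names = tmp['market_short'].tolist()
```

The original market_area column is left untouched, so the full names remain available.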

Or use the str.extract method with the regex ^([^-]*).* , which captures everything up to (but not including) the first - :

tmp.market_area.str.extract('^([^-]*).*', expand=False)
Out[5]:
0    San Francisco
1              NaN
2           Dallas
3      Los Angeles
Name: market_area, dtype: object
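
One case the sample data does not cover is a value with no hyphen at all. The sketch below compares the two approaches on such a value; 'Boise' is a hypothetical entry added for illustration, and the regex is shortened to ^([^-]*) on the assumption that the trailing .* is not needed to capture the leading part:

```python
import pandas as pd

# 'Boise' is a made-up value with no hyphen; None stands in for a missing entry.
s = pd.Series(['Dallas-Fort Worth', 'Boise', None])

# Split on the hyphen and keep the first piece.
by_split = s.str.split('-').str[0]

# Capture the leading run of non-hyphen characters.
by_extract = s.str.extract(r'^([^-]*)', expand=False)
```

With both approaches, a value without a hyphen should come back whole and a missing value should stay missing.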

solution1
3 ACCEPTED 2017-11-17 20:46:18
