
(Python, Pandas) - How do I get everything to the left of a certain character?

I have a column, market_area, that I want to abbreviate by keeping only the part of each string to the left of the first hyphen.

For example, my data is like this:

import pandas as pd
tmp = pd.DataFrame({'market_area': ['San Francisco-Oakland-San Jose',
                                    None, 
                                    'Dallas-Fort Worth', 
                                    'Los Angeles-Riverside-Orange County'],
                    'val': [1,2,3,4]})

My desired output would be:

['San Francisco', None, 'Dallas', 'Los Angeles']

I am able to split based on the hyphen:

tmp['market_area'].str.split('-')

But how do I extract only the part to the left of the hyphen?

You can extract the first element of each split list using .str[0]:

tmp.market_area.str.split('-').str[0]
Out[3]:
0    San Francisco
1             None
2           Dallas
3      Los Angeles
Name: market_area, dtype: object
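Since only the text before the first hyphen is needed, the split can stop early. A minimal sketch, assuming the same tmp frame as in the question; the variable names left and result are only for illustration:

```python
import pandas as pd

tmp = pd.DataFrame({'market_area': ['San Francisco-Oakland-San Jose',
                                    None,
                                    'Dallas-Fort Worth',
                                    'Los Angeles-Riverside-Orange County'],
                    'val': [1, 2, 3, 4]})

# n=1 splits at the first hyphen only, so each list has at most two pieces
left = tmp['market_area'].str.split('-', n=1).str[0]

# The missing entry stays missing (None or NaN, depending on the pandas version)
result = left.tolist()
```

Limiting the split with n=1 avoids building list elements that are thrown away immediately.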

Or use the str.extract method with the regex ^([^-]*).* , which captures everything up to the first hyphen. Note that the missing value comes back as NaN here rather than None:

tmp.market_area.str.extract('^([^-]*).*', expand=False)
Out[5]:
0    San Francisco
1              NaN
2           Dallas
3      Los Angeles
Name: market_area, dtype: object
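If the result has to match the desired list exactly, with None rather than NaN for the missing entry, one option is to convert the missing values back when building the list. This is a sketch under the assumption that a plain Python list is the goal; the names extracted and result are hypothetical:

```python
import pandas as pd

tmp = pd.DataFrame({'market_area': ['San Francisco-Oakland-San Jose',
                                    None,
                                    'Dallas-Fort Worth',
                                    'Los Angeles-Riverside-Orange County'],
                    'val': [1, 2, 3, 4]})

# Capture everything before the first hyphen; expand=False returns a Series
extracted = tmp['market_area'].str.extract(r'^([^-]*)', expand=False)

# Map any missing value (NaN) back to None while converting to a list
result = [None if pd.isna(x) else x for x in extracted]
```

The trailing .* from the original pattern is omitted here because str.extract only returns the captured group, so it does not change the result.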
