
Pandas - 'cut' everything after a certain character in a string column and paste it in the beginning of the column

In a pandas DataFrame string column, I want to grab everything after a certain character and move it to the beginning of the string, stripping the character itself. What is the most efficient and cleanest way to achieve this?

Input Dataframe:

>>> import pandas as pd
>>> df = pd.DataFrame({'city':['Bristol, City of', 'Newcastle, City of', 'London']})
>>> df
                 city
0    Bristol, City of
1  Newcastle, City of
2              London
>>>

My desired dataframe output:

                city
0    City of Bristol
1  City of Newcastle
2             London

Assuming there are only two pieces to each string at most, you can split, reverse, and join:

df.city.str.split(', ').str[::-1].str.join(' ')

0      City of Bristol
1    City of Newcastle
2               London
Name: city, dtype: object

If a string can contain more than one comma, split on the first one only (recent pandas versions require n to be passed as a keyword):

df.city.str.split(', ', n=1).str[::-1].str.join(' ')

0      City of Bristol
1    City of Newcastle
2               London
Name: city, dtype: object
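
To see why limiting the split matters, here is a minimal plain-Python sketch of the same split, reverse, and join steps on a single value. The three-part string is a hypothetical input, not one from the question's data.

```python
# Hypothetical value with two commas (not part of the original DataFrame).
s = 'Bristol, City of, England'

# Splitting on every comma reverses all three pieces.
split_all = ' '.join(s.split(', ')[::-1])

# Splitting on the first comma only keeps the remainder intact.
split_first = ' '.join(s.split(', ', 1)[::-1])

print(split_all)    # England City of Bristol
print(split_first)  # City of, England Bristol
```

Which of the two results is wanted depends on the data; the second one matches "move everything after the first comma to the front".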

Another option is str.partition :

u = df.city.str.partition(', ')
# Rows without a comma have an empty last part, so strip the leading space.
(u.iloc[:,-1] + ' ' + u.iloc[:,0]).str.strip()

0      City of Bristol
1    City of Newcastle
2               London
dtype: object

This always splits on the first comma only.
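
One detail worth checking is what partition returns when the separator is absent. The sketch below uses Python's built-in str.partition, which the pandas method applies element-wise; the variable names are only illustrative.

```python
# With the separator present, partition yields (head, separator, tail).
head, sep, tail = 'Bristol, City of'.partition(', ')
with_comma = tail + ' ' + head

# Without it, the separator and tail are empty strings, so plain
# concatenation leaves a leading space that needs stripping.
head, sep, tail = 'London'.partition(', ')
naive = tail + ' ' + head
cleaned = naive.strip()

print(repr(with_comma))  # 'City of Bristol'
print(repr(naive))       # ' London'
print(repr(cleaned))     # 'London'
```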


You can also use a list comprehension, if you need performance:

df.assign(city=[' '.join(s.split(', ', 1)[::-1]) for s in df['city']])

                city
0    City of Bristol
1  City of Newcastle
2             London

Why should you care about loopy solutions? For loops are fast when working with string/regex functions (often faster than the pandas str methods, at least). You can read more at "For loops with pandas - When should I care?".
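
If the column may contain missing values, a bare list comprehension fails because non-string entries have no split method. A minimal sketch of one way around that, assuming missing values should pass through unchanged; the helper name move_suffix_first is hypothetical:

```python
def move_suffix_first(value, sep=', '):
    """Move the text after the first `sep` to the front of the string.

    Hypothetical helper: non-string values (None, NaN) are returned as-is.
    """
    if not isinstance(value, str):
        return value
    head, found, tail = value.partition(sep)
    # `found` is empty when the separator does not occur in the string.
    return f'{tail} {head}' if found else head


cities = ['Bristol, City of', 'Newcastle, City of', 'London', None]
result = [move_suffix_first(c) for c in cities]
print(result)
# ['City of Bristol', 'City of Newcastle', 'London', None]
```

The same list comprehension can be passed to df.assign(city=...) in place of the one-liner above.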
