简体   繁体   中英

Python dataframe remove substring before specific character if ture

I am trying to remove the numbers before "-" in the name column. But not all rows have numbers before the name. How do I remove the numbers in rows that have numbers and keep the rows that don't have numbers in front untouched?

Sample df:

country     Name
UK          5413-Marcus
Russia      5841-Natasha
Hong Kong   Keith
China       7777-Wang

Desired df

country     Name
UK          Marcus
Russia      Natasha
Hong Kong   Keith
China       Wang

I appreciate any assistance! Thanks in advance!

Pandas has string accessors for series. If you split and get the last element of the resulting list, even if a row does not have the delimeter '-' you still want the last element of that one-element list.

df.Name = df.Name.str.split('-').str.get(-1)

You might use str.lstrip for that task following way:

import pandas as pd
df = pd.DataFrame({'country':['UK','Russia','Hong Kong','China'],'Name':['5413-Marcus','5841-Natasha','Keith','7777-Wang']})
df['Name'] = df['Name'].str.lstrip('-0123456789')
print(df)

Output:

     country     Name
0         UK   Marcus
1     Russia  Natasha
2  Hong Kong    Keith
3      China     Wang

.lstrip does remove leading characters, .rstrip trailing characters and .strip both.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM