
How to remove the part of a string before a special character in a pandas column?

I have this simple dataframe:

In [101]: df = pd.DataFrame({'a':[1,2,3],'b':['ciao','hotel',"l'hotel"]})

In [102]: df
Out[102]: 
   a           b
0  1        ciao
1  2       hotel
2  3     l'hotel

The goal is to remove the part of each string before the apostrophe ('), so that df becomes:

   a           b
0  1        ciao
1  2       hotel
2  3       hotel

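So far I have tried splitting each string on "'" and taking the second element only, but this fails because the strings (and therefore the resulting lists) have different lengths: a string without an apostrophe yields a one-element list, so index 1 raises an IndexError: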

df['c'] = df['b'].apply(lambda x: x.split("'")[1])

You can use -1 to always get the last part rather than the second part.

df['c'] = df['b'].apply(lambda x: x.split("'")[-1])

print(df)

#    a        b      c
# 0  1     ciao   ciao
# 1  2    hotel  hotel
# 2  3  l'hotel  hotel 

However, keep in mind that for strings with two or more apostrophes this keeps only the text after the last apostrophe (your requirement doesn't specify what to do in those cases anyway).
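To make the multiple-apostrophe behaviour concrete, here is a small sketch in plain Python. The sample string and the two helper names are made up for illustration; which result is the right one depends on a requirement the question does not state.

```python
def after_last_apostrophe(text):
    # Split on every apostrophe and keep the final piece.
    return text.split("'")[-1]

def after_first_apostrophe(text):
    # Split at most once, so only the text before the first apostrophe is dropped.
    return text.split("'", 1)[-1]

sample = "l'arc d'or"  # hypothetical value with two apostrophes

print(after_last_apostrophe(sample))   # or
print(after_first_apostrophe(sample))  # arc d'or
print(after_first_apostrophe("ciao"))  # ciao (no apostrophe, so unchanged)
```

Either helper can be passed to df['b'].apply(...) in place of the lambda.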

Use str.split and select the last element of each resulting list with .str[-1]:

df['c'] = df['b'].str.split("'").str[-1]
print(df)
   a        b      c
0  1     ciao   ciao
1  2    hotel  hotel
2  3  l'hotel  hotel
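One practical difference from the apply/lambda version is that the .str methods pass missing values through instead of raising. A minimal sketch, assuming a column that contains a None entry (this value is not part of the original data):

```python
import pandas as pd

# Hypothetical column with one missing value.
s = pd.Series(["ciao", None, "l'hotel"])

# The .str accessor leaves the missing entry missing.
cleaned = s.str.split("'").str[-1]
print(cleaned.tolist())

# The plain-Python lambda fails on the missing entry,
# because the missing value has no .split method.
apply_failed = False
try:
    s.apply(lambda x: x.split("'")[-1])
except AttributeError as exc:
    apply_failed = True
    print("apply failed:", exc)
```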

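Or use str.replace with a regular expression (pass regex=True explicitly, since recent pandas versions treat the pattern as a literal string by default):

df['c'] = df['b'].str.replace(r".*'", '', regex=True)
print(df)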
   a        b      c
0  1     ciao   ciao
1  2    hotel  hotel
2  3  l'hotel  hotel
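The choice of pattern matters when a value holds more than one apostrophe. The sketch below uses Python's re module on a made-up string to show the difference between a greedy and a lazy match; the same patterns can be passed to str.replace with regex=True.

```python
import re

sample = "l'arc d'or"  # hypothetical value with two apostrophes

# Greedy: .* runs up to the last apostrophe.
greedy = re.sub(r"^.*'", "", sample)
print(greedy)     # or

# Lazy: .*? stops at the first apostrophe.
lazy = re.sub(r"^.*?'", "", sample)
print(lazy)       # arc d'or

# No apostrophe: nothing matches, so the string comes back unchanged.
unchanged = re.sub(r"^.*'", "", "ciao")
print(unchanged)  # ciao
```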


 