Pandas - Extract a string starting with a particular character

It should be fairly simple yet I'm not able to achieve it.

I have a dataframe df1, having a column "name_str". Example below:

   name_str 
0    alp:ha
1    bra:vo
2  charl:ie

I need to create another column containing up to, say, 5 characters that follow the colon (:). I've written the following code:

import pandas as pd

data = {'name_str':["alp:ha", "bra:vo", "charl:ie"]}
#indx = ["name_1",]
df1 = pd.DataFrame(data=data)
n= df1['name_str'].str.find(":")+1
df1['slize'] = df1['name_str'].str.slice(n,2)
print(df1)

But the output is disappointing; every row is NaN:

   name_str  slize
0    alp:ha    NaN
1    bra:vo    NaN
2  charl:ie    NaN

The output should've been:

   name_str  slize
0    alp:ha    ha
1    bra:vo    vo
2  charl:ie    ie

Would anyone please help? Appreciate it.

You can use str.extract to extract everything after the colon with this regular expression: :(.*)

df1['slize'] = df1.name_str.str.extract(':(.*)')

>>> df1
   name_str slize
0    alp:ha    ha
1    bra:vo    vo
2  charl:ie    ie
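
For completeness, here is a self-contained sketch of the same approach. It adds one extra row without a colon ("delta", a made-up value that is not in the question) to show that rows which don't match the pattern come back as NaN. Passing expand=False is optional here; it just makes str.extract return a Series rather than a one-column DataFrame.

```python
import pandas as pd

# "delta" is a hypothetical extra row, added only to show the no-colon case.
df1 = pd.DataFrame({"name_str": ["alp:ha", "bra:vo", "charl:ie", "delta"]})

# Capture everything after the first colon; a raw string avoids escape surprises.
# expand=False returns a Series because the pattern has a single capture group.
df1["slize"] = df1["name_str"].str.extract(r":(.*)", expand=False)

print(df1)
```

Rows without a colon can then be handled with fillna or dropna, depending on what you need.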

Edit, based on your updated question

If you'd like to extract up to 5 characters after the colon, then you can use this modification:

df1['slize'] = df1.name_str.str.extract(':(.{,5})')
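
As for why the original attempt produced NaN: as far as I can tell, str.slice expects plain integer positions that are applied to every row alike, so handing it a Series of per-row offsets (the result of str.find) doesn't do what was intended. If you'd rather avoid a regex, the following is a rough sketch of two alternatives; the column name slize_manual is only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"name_str": ["alp:ha", "bra:vo", "charl:ie"]})

# Option 1: split once on the first colon, take the right-hand part,
# then keep at most 5 characters of it.
after_colon = df1["name_str"].str.split(":", n=1).str[1]
df1["slize"] = after_colon.str[:5]

# Option 2: per-row offsets, closest to the original idea.
# str.find returns -1 when there is no colon, so guard against that.
pos = df1["name_str"].str.find(":")
df1["slize_manual"] = [
    s[p + 1 : p + 6] if p >= 0 else None
    for s, p in zip(df1["name_str"], pos)
]

print(df1)
```

Both columns should hold the same values for this data; the regex version is shorter, but the split version may be easier to read if you're not comfortable with regular expressions.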
