
Extracting string between 2 characters from Dataframe column

I have a column with entries like: Hello [World]. I am trying to extract 'World' from every row and put it in a new column.

I'm not sure how to go about this, as I am not familiar with regex.

Thanks.

It would look something like this:

import pandas as pd

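# Sample data: the text of interest sits inside square brackets.
df = pd.DataFrame([['hello [world]'], ['something [else]']], columns=['words'])
# Strip everything up to '[' and the trailing ']', storing the result in a new column.
# regex=True is needed because pandas 2.0+ treats the pattern as a literal string by default.
df['extracted'] = df['words'].str.replace(r'^.*\[|\]$', '', regex=True)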

print(df)

The only complicated part there is the regex pattern ^.*\[|\]$ . It reads: starting at the beginning of the string ^ match any characters .* up to and including a [ character \[ (because .* is greedy, this is the last [ in the string), OR | match a ] character that sits at the very end of the string \]$ . Everything matched is replaced with nothing '', which leaves only the text between the brackets.
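An alternative worth considering is to capture the bracketed text directly rather than deleting everything around it. The following is a minimal sketch using str.extract with one capture group; the column name 'extracted' and the third sample row (which has no brackets) are assumptions added for illustration and are not part of the original question.

```python
import pandas as pd

# Sample data; the last row is a hypothetical entry with no brackets.
df = pd.DataFrame({'words': ['hello [world]', 'something [else]', 'no brackets']})

# \[          a literal opening bracket
# ([^\]]*)    capture group: any run of characters that are not ']'
# \]          a literal closing bracket
# expand=False returns a Series instead of a one-column DataFrame.
df['extracted'] = df['words'].str.extract(r'\[([^\]]*)\]', expand=False)

print(df)
```

With this approach, rows that contain no bracketed text end up as NaN in the new column instead of keeping their original value, which makes unmatched rows easy to spot.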

If you are going to be doing this kind of thing often, I would highly encourage you to learn regex.


 