
How to extract a part of a sentence or string in Python using regular expression and strip methods

For example: Art Direction: Eve Stewart; Set Decoration: Ev... Art Direction: Luciana Arrighi; Set Decoration... Art Direction: Rick Heinrichs; Set Decoration:

I want to extract the second element of each string above (for example, "Eve Stewart") and store it in a separate DataFrame column named "Art Directors".

#Art Direction: Eve Stewart; Set Decoration: Ev...
import re
art=[ ]

for row in before_2000["art_directors"]:
    found = re.search("Art Direction:(.+); Set Decoration", row)
    art.append(found)

Try the following code:

import re
import pandas as pd

# Adjacent string literals are joined, so the sample stays one string.
string = (
    "Art Direction: Eve Stewart; Set Decoration: Ev... "
    "Art Direction: Luciana Arrighi; Set Decoration... "
    "Art Direction: Rick Heinrichs;"
)

pattern = "Art Direction:(.*?);"

art_directors = re.findall(pattern, string)
art_directors = [x.strip() for x in art_directors]

df = pd.DataFrame({"Art Directors": art_directors})

First you define the string data, then the regex pattern you're searching for. The ? in (.*?) makes the match non-greedy, so each capture stops at the first semicolon instead of running on to the last one. re.findall(pattern, string) then returns every captured group as a list. I stripped the whitespace surrounding each value, and the resulting art_directors list is loaded into a pandas DataFrame.
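The question loops over a column with one credit string per row, and re.search returns a match object (or None) rather than the captured text. A minimal sketch of a per-row version follows, assuming each row looks like the samples in the question. The helper name extract_art_director and the "no credit listed" row are hypothetical and only there for illustration.

```python
import re

# Lazy capture of everything between "Art Direction:" and the next semicolon;
# the \s* on either side drops surrounding whitespace from the captured name.
ART_DIRECTION = re.compile(r"Art Direction:\s*(.*?)\s*;")


def extract_art_director(row):
    """Return the first art director named in `row`, or None if there is none."""
    if not isinstance(row, str):  # e.g. NaN for a missing value
        return None
    found = ART_DIRECTION.search(row)
    # search() returns a match object; group(1) is the captured name itself.
    return found.group(1) if found else None


# Sample rows taken from the question, plus one hypothetical row with no credit.
rows = [
    "Art Direction: Eve Stewart; Set Decoration: Ev...",
    "Art Direction: Luciana Arrighi; Set Decoration...",
    "Art Direction: Rick Heinrichs; Set Decoration:",
    "no credit listed",
]
art = [extract_art_director(row) for row in rows]
```

With the DataFrame from the question, something like before_2000["Art Directors"] = before_2000["art_directors"].apply(extract_art_director) should create the new column, with rows that have no match left as missing values.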
