For example: Art Direction: Eve Stewart; Set Decoration: Ev... Art Direction: Luciana Arrighi; Set Decoration... Art Direction: Rick Heinrichs; Set Decoration:
I want to extract the second element of the string above, "Eve Stewart", and store it in a separate DataFrame column named "Art Directors".
#Art Direction: Eve Stewart; Set Decoration: Ev...
import re
art = []
for row in before_2000["art_directors"]:
    found = re.search("Art Direction:(.+); Set Decoration", row)
    art.append(found)
Try the following code:
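import re
import pandas as pd

# Sample data: one string containing several "Art Direction: <name>;" entries
string = "Art Direction: Eve Stewart; Set Decoration: Ev... Art Direction: Luciana Arrighi; Set Decoration... Art Direction: Rick Heinrichs;"
# Non-greedy group: capture everything up to the first ";" after the label
pattern = r"Art Direction:(.*?);"
art_directors = re.findall(pattern, string)
# Remove the whitespace left around each captured name
art_directors = [x.strip() for x in art_directors]
df = pd.DataFrame({"Art Directors": art_directors})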
First you define the string data, then the regex pattern to search for. The non-greedy group (.*?) captures everything between "Art Direction:" and the next semicolon, so each match stops at the end of one name. re.findall(pattern, string) then returns every captured name in the string as a list, which you can pass to a pandas DataFrame. I also stripped the whitespace surrounding each value.
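If each row of before_2000["art_directors"] holds its own string, as the loop in the question suggests, the same pattern can be applied row by row. The following is only a sketch under that assumption: first_art_director and the sample rows list are hypothetical names introduced for illustration, and it uses match.group(1) because re.search returns a match object rather than the captured text.

```python
import re

# Lazy capture: stop at the first ";" after "Art Direction:".
# The \s* on either side trims whitespace around the name.
ART_DIRECTION = re.compile(r"Art Direction:\s*(.*?)\s*;")


def first_art_director(cell):
    """Return the first art director named in cell, or None if absent."""
    if not isinstance(cell, str):  # missing values (None, NaN) are not strings
        return None
    match = ART_DIRECTION.search(cell)
    # re.search returns a match object; group(1) is the captured name
    return match.group(1) if match else None


# Hypothetical stand-in for the values in before_2000["art_directors"]
rows = [
    "Art Direction: Eve Stewart; Set Decoration: Ev...",
    "Art Direction: Luciana Arrighi; Set Decoration...",
    "Art Direction: Rick Heinrichs; Set Decoration:",
    None,
]

art = [first_art_director(row) for row in rows]
```

With a real DataFrame, the same function could be passed to Series.apply, for example before_2000["Art Directors"] = before_2000["art_directors"].apply(first_art_director), which keeps the result aligned with the original rows.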