I have a dataframe column which contains URLs. I want to extract a specific string out of this URL by using a regex pattern for each row. Here is an example of the URL string:
'www.abcdef.com/sports-bra-sports-bra-black-abcde1f02-c11.html',
Since a column is a Series and I assumed I had to iterate over it, I tried the following two approaches:
1.
for i in df['landing_screen_name']:
    regex = i.str.extract(r'.{0,13}.html')
    print(regex)
    break
2.
for idx, row in df.iterrows():
    a = row['landing_screen_name'].str.contains(r'.{0,13}.html')
    print(a)
    break
However, both attempts raise the following error:
AttributeError: 'str' object has no attribute 'str'
I've tried everything I could think of but haven't found the cause yet. Could you please help me with this?
Try this:
df['landing_screen_name'] = df['landing_screen_name'].str.extract(r'(.{0,13}\.html)')
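To check what that pattern captures, here is a minimal sketch built on the example URL from the question. The one-row DataFrame and the name `codes` are assumptions made for illustration; `expand=False` is passed so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical one-row frame built from the URL in the question.
df = pd.DataFrame({
    "landing_screen_name": [
        "www.abcdef.com/sports-bra-sports-bra-black-abcde1f02-c11.html",
    ]
})

# Vectorized extraction on the whole column; no explicit loop is needed.
# The parentheses form the capture group that str.extract requires,
# and \. matches a literal dot instead of any character.
codes = df["landing_screen_name"].str.extract(r"(.{0,13}\.html)", expand=False)

print(codes.iloc[0])  # abcde1f02-c11.html
```

Rows that do not match the pattern come back as NaN, so it may be worth checking for missing values before overwriting the original column.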
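You should operate at the column level. The .str accessor belongs to the Series, not to the individual strings you get when iterating over it, which is why the loops fail. Note also that str.extract needs a capture group in the pattern, e.g. use:
print(df['landing_screen_name'].str.extract(r'(.{0,13}\.html)').to_string(index=False))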
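If a row-by-row loop is really wanted, a possible sketch is to use the standard `re` module on each plain string instead of `.str`. The sample data, the second URL, and the helper name `extract_tail` are assumptions for illustration only.

```python
import re
import pandas as pd

# Hypothetical sample data; the second URL has no ".html" suffix.
df = pd.DataFrame({
    "landing_screen_name": [
        "www.abcdef.com/sports-bra-sports-bra-black-abcde1f02-c11.html",
        "www.abcdef.com/about",
    ]
})

# Each element of the column is an ordinary Python str, which has no
# .str attribute; that is the source of the AttributeError.
first = df["landing_screen_name"].iloc[0]
print(type(first))

pattern = re.compile(r".{0,13}\.html")

def extract_tail(url):
    """Return up to 13 characters plus '.html', or None if there is no match."""
    match = pattern.search(url)
    return match.group(0) if match else None

tails = [extract_tail(url) for url in df["landing_screen_name"]]
print(tails)
```

On large frames the column-level `.str.extract` call is usually the more idiomatic choice; the loop is shown only to make the cause of the error visible.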