I want to find all
<a href='https://example.com/'>
references in a large file and append the
target='_blank' rel='noopener noreferrer'
attributes to the end of the tag if they are missing.
Roughly, I did the following:
re.sub(r'<a href=([^>]+)', r'<a href=([^>]+)' + " target='_blank' rel='noopener noreferrer'", content)
Note: content contains the body of text to alter.
But the second argument, which should be the replacement value, is messing up the result.
The output I am getting is:
<a href=([^>]+) target='_blank' rel='noopener noreferrer'>
The expected result should be:
<a href='https://example.com/' target='_blank' rel='noopener noreferrer'>
What am I doing incorrectly, and how do I fix this issue?
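Try this. The second argument to re.sub is a replacement string, not a regex, so ([^>]+) is inserted literally. Wrap the pattern in a capturing group and refer back to it with \1 instead. (If coding professionally, use the tool ti7 suggested.)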
import re
content = "<a href='https://example.com/'>"
x = re.sub(r'(<a href=([^>]+))', r'\1' + " target='_blank' rel='noopener noreferrer'", content)
print(x)
Output:
<a href='https://example.com/' target='_blank' rel='noopener noreferrer'>
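The snippet above appends the attributes to every matching tag, including tags that already have them. A minimal sketch of the "only if it is missing" part follows, using a negative lookahead to skip tags that already carry a target attribute. The names ANCHOR_WITHOUT_TARGET and add_target_blank are made up for illustration, and the pattern assumes no attribute value contains a ">" character. It also does not check for an existing rel attribute.

```python
import re

# Match an opening <a ...> tag that has an href but no target attribute.
# (?![^>]*\starget\s*=) is a negative lookahead: it rejects the tag if
# " target=" appears anywhere before the closing ">".
ANCHOR_WITHOUT_TARGET = re.compile(r"<a\b(?![^>]*\starget\s*=)([^>]*\bhref=[^>]*)>")


def add_target_blank(html: str) -> str:
    """Append target and rel to <a href=...> tags that lack a target."""
    # \1 is everything between "<a" and ">" in the original tag.
    return ANCHOR_WITHOUT_TARGET.sub(
        r"<a\1 target='_blank' rel='noopener noreferrer'>", html
    )


content = "<a href='https://example.com/'>"
print(add_target_blank(content))
```

Running the function a second time on its own output leaves the text unchanged, since every tag then has a target attribute.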
If you can use a 3rd-party library, BeautifulSoup may work very well for you!
https://www.crummy.com/software/BeautifulSoup/bs4/doc/
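from bs4 import BeautifulSoup
soup = BeautifulSoup(html_contents, "html.parser")  # html_contents holds the HTML text to alter
for a in soup.find_all("a", href=True):  # only <a> tags that have an href
    a.attrs.setdefault("target", "_blank")  # added only if missing
    a.attrs.setdefault("rel", "noopener noreferrer")
html_contents = str(soup)  # serialize the modified document back to text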