I am trying to find all the comments in a web page.
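import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:88.0) Gecko/20100101 Firefox/88.0'}
# Use the session as a context manager so the connection is closed afterwards
with requests.Session() as session:
    # verify=False skips TLS certificate checks (kept from the original script)
    r = session.get('https://www.example.com', verify=False, headers=headers)
    print(r.text)  # r.text holds the page source; print(r) only shows the status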
This script returns the entire source of the page, but I am only interested in the comments. Can anyone help me with a regular expression to find them, or is there a better way to do this?
You might try BeautifulSoup4, which has built-in support for identifying comments.
Here's a Stack Overflow question that demonstrates this: How to find all comments with Beautiful Soup
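For what it's worth, here is a minimal sketch of the BeautifulSoup approach. It assumes the `beautifulsoup4` package is installed, and it uses a made-up HTML string in place of the downloaded page so it runs without a network connection. The helper name `find_comments` and the sample markup are only for illustration; they are not from the linked question.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical sample markup standing in for the downloaded page (r.text)
html = """
<html>
  <!-- header comment -->
  <body>
    <p>Hello</p>
    <!-- TODO: remove this block -->
  </body>
</html>
"""


def find_comments(markup):
    """Return the text of every HTML comment found in the markup."""
    soup = BeautifulSoup(markup, "html.parser")
    # Comment is a subclass of NavigableString, so filter text nodes by type
    nodes = soup.find_all(string=lambda text: isinstance(text, Comment))
    return [str(node).strip() for node in nodes]


comments = find_comments(html)
for comment in comments:
    print(comment)
```

With the script from the question you would pass `r.text` to `find_comments` instead of the sample string. A parser tends to be more reliable than a regular expression here, since a regex can trip over comment-like text inside scripts or attribute values.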