Relevant part of the DOM: Screenshot of the DOM
This is the code I wrote:
from bs4 import BeautifulSoup
import requests
URL = 'https://www.cheapflights.com.sg/flight-search/SIN-KUL/2022-06-04?sort=bestflight_a&attempt=3&lastms=1653844067064'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
flight = soup.find('div', class_='resultWrapper')
print(flight)
Whenever print(flight) is executed, the result is always None. I have tried other div tags with different class names, but it still returns None. The soup itself seems fine: print(soup) prints a text version of the DOM, so the problem seems to be with the soup.find line.
Any suggestions on how I can get something other than None? Thank you!
That's because of the User-Agent. If I curl the page without changing the default User-Agent, the site returns a different page from the one a browser gets.
Change your code like this so that your program is not detected:
headers = {
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ..."
}
page = requests.get(URL, headers=headers)
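To see what actually changes, here is a minimal sketch that prepares a request without sending it and inspects the outgoing User-Agent. By default, requests identifies itself as python-requests/<version>, which is easy for a site to detect. The helper names build_request and find_result, the BROWSER_UA value, and the example.com URL are all made up for illustration; copy the real User-Agent string from your own browser. Whether the site then returns the flight results is not something this sketch can verify, so it also guards against find() returning None.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical placeholder, not a real browser string: copy yours from your browser.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def build_request(url, user_agent=None):
    """Prepare (but do not send) a GET request so its headers can be inspected."""
    session = requests.Session()
    headers = {"User-Agent": user_agent} if user_agent else {}
    # prepare_request merges the session's default headers with ours.
    return session.prepare_request(requests.Request("GET", url, headers=headers))


def find_result(html, class_name="resultWrapper"):
    """Return the first matching div, or None if the page does not contain it."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.find("div", class_=class_name)


if __name__ == "__main__":
    url = "https://example.com/flight-search"  # assumed URL for illustration
    print(build_request(url).headers["User-Agent"])              # requests' default
    print(build_request(url, BROWSER_UA).headers["User-Agent"])  # the override

    # No network call here: a stand-in for the HTML a script might get back.
    flight = find_result("<html><body><p>placeholder page</p></body></html>")
    if flight is None:
        print("resultWrapper not found; the response is not the expected page")
```

Checking for None before using the result makes it obvious when the response is not the page you expected, instead of failing later with an AttributeError.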