BeautifulSoup is returning empty data from website
I am running this code:
import requests
from bs4 import BeautifulSoup
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}
r = requests.get('https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2')
soup = BeautifulSoup(r.content, 'lxml')
print(soup.find('div', class_='price').text)
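I am trying to get the price of a product on this site: https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2. When I run my code, all I get is empty data. Am I doing something wrong, or is the website doing something special to stop me from scraping the price?
As mentioned in the comments, you can get the product data from the store's API.
Here's how: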
import requests
from bs4 import BeautifulSoup

# The "x-requested-with" header marks the POST below as an AJAX request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/78.0.3904.108 Safari/537.36',
    "x-requested-with": "XMLHttpRequest"
}

# Fetch the product page and read the product id from its "d-session-product" input.
product_url = "https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2"
page_content = requests.get(product_url).content
soup = BeautifulSoup(page_content, 'lxml')
product_id = soup.find("input", {"name": "d-session-product"})["value"]

# Form data sent to the price-and-stock endpoint.
payload = {
    "debug": "off",
    "ajax": "1",
    "product_list": product_id,
    "action": "init",
    "showStockStatusAndShoppingcart": "1",
    "enablePickupAtNearbyStores": "yes",
}

# The endpoint answers with JSON; the price is in the first entry of "price".
endpoint = "https://www.bohus.no/lite.cgi/module/priceAndStock"
product_data = requests.post(endpoint, data=payload, headers=headers).json()
print(product_data["price"][0]["salesPriceNormal"])
Output:
8799
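The lookup `product_data["price"][0]["salesPriceNormal"]` raises `KeyError` or `IndexError` if the response has a different shape, for instance if the price list comes back empty. Below is a minimal sketch of a more defensive accessor. It assumes only the one field used above; the helper name `extract_sales_price` and the sample dictionaries are hypothetical and not taken from the site.

```python
from typing import Any, Optional


def extract_sales_price(product_data: Any) -> Optional[Any]:
    """Return product_data["price"][0]["salesPriceNormal"], or None if absent."""
    if not isinstance(product_data, dict):
        return None
    prices = product_data.get("price")
    # "price" must be a non-empty list for the [0] lookup to be valid.
    if not isinstance(prices, list) or not prices:
        return None
    first = prices[0]
    if not isinstance(first, dict):
        return None
    return first.get("salesPriceNormal")


# Hypothetical samples that mirror only the field used in the answer.
print(extract_sales_price({"price": [{"salesPriceNormal": 8799}]}))
print(extract_sales_price({"price": []}))
```

With this helper, a missing price shows up as `None` instead of an exception, so the caller can decide whether to retry or skip the product.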
Alternatively, you can use the requests-HTML library:
from requests_html import HTMLSession

session = HTMLSession()
r = session.get('https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2')
# render() runs the page's JavaScript in headless Chromium (downloaded on first use).
r.html.render()
sel = 'div.price-data > div.price'
print(r.html.find(sel, first=True).text)
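The value printed by `r.html.find(sel, first=True).text` is a display string, so it may contain separators or currency markers rather than a bare number. Here is a small sketch for turning such a string into an integer, assuming the price is a whole number of kroner. The helper name `parse_price` and the sample strings are hypothetical, since the exact text the page renders is not shown here.

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[int]:
    """Keep only the digits in a price string; return None if there are none."""
    digits = re.sub(r"\D", "", text)
    return int(digits) if digits else None


# Hypothetical display strings, not captured from the site.
print(parse_price("8 799,-"))
print(parse_price("kr 8.799"))
print(parse_price(""))
```

This approach is deliberately crude: a price with a decimal part such as "8799,50" would be read as 879950, so it only fits whole-number prices.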