BeautifulSoup is returning empty data from website

I am running this code:

import requests
from bs4 import BeautifulSoup


headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}

r = requests.get('https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2', headers=headers)

soup = BeautifulSoup(r.content, 'lxml')


print(soup.find('div', class_='price').text)

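I'm trying to get the price of a product on this site: https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2 . When I run my code, all I get is empty data. Am I doing something wrong, or is the website doing something special to stop me from scraping the price?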

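As mentioned in the comments, you can get the product data from the shop's API. The price appears to be filled in after page load by a separate request, which would explain why the div is empty in the HTML that requests receives.

Here's how: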

import requests
from bs4 import BeautifulSoup


headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/78.0.3904.108 Safari/537.36',
    # Marks the request as an AJAX (XMLHttpRequest) call.
    "x-requested-with": "XMLHttpRequest"
}

# Step 1: load the product page and read the product ID from an input field.
product_url = "https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2"
page_content = requests.get(product_url).content
soup = BeautifulSoup(page_content, 'lxml')
product_id = soup.find("input", {"name": "d-session-product"})["value"]

# Step 2: request price and stock data for that product ID.
payload = {
    "debug": "off",
    "ajax": "1",
    "product_list": product_id,
    "action": "init",
    "showStockStatusAndShoppingcart": "1",
    "enablePickupAtNearbyStores": "yes",
}

endpoint = "https://www.bohus.no/lite.cgi/module/priceAndStock"
product_data = requests.post(endpoint, data=payload, headers=headers).json()
print(product_data["price"][0]["salesPriceNormal"])

Output:

8799
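
The lookup `product_data["price"][0]["salesPriceNormal"]` raises an exception if the endpoint returns an empty list or a different shape. Below is a minimal sketch of a more defensive read. It assumes the response has the shape the code above indexes into (a `price` list of dicts with a `salesPriceNormal` key); the helper name `extract_sales_price` and the sample dicts are hypothetical, not real responses from the site.

```python
from typing import Any, Optional


def extract_sales_price(product_data: Any) -> Optional[Any]:
    """Return the first salesPriceNormal value, or None if it is absent.

    Assumed shape (taken from the answer's indexing, not from API docs):
    {"price": [{"salesPriceNormal": ...}, ...]}
    """
    if not isinstance(product_data, dict):
        return None
    prices = product_data.get("price")
    if not isinstance(prices, list) or not prices:
        return None
    first = prices[0]
    if not isinstance(first, dict):
        return None
    return first.get("salesPriceNormal")


# Hypothetical payloads for illustration only.
print(extract_sales_price({"price": [{"salesPriceNormal": 8799}]}))
print(extract_sales_price({"price": []}))
```

Returning `None` instead of raising lets the caller decide whether a missing price is an error or just an unavailable product.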

Alternatively, you can use the requests-HTML library:

from requests_html import HTMLSession

session = HTMLSession()

r = session.get('https://www.bohus.no/spiseplassen/oppbevaring-1/gradino-vitrine-2')
# render() executes the page's JavaScript in headless Chromium
# (downloaded automatically on first use).
r.html.render()
sel = 'div.price-data > div.price'
print(r.html.find(sel, first=True).text)
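
The rendered text is a display string rather than a number. Here is a sketch of converting it to an integer, assuming the price is a whole number shown with whitespace group separators and trailing decoration such as `8 799,-`. The exact on-page format is an assumption, and `parse_price` is a hypothetical helper.

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[int]:
    """Return the first whole number in text, ignoring whitespace separators.

    Assumes whole-unit prices; decimals and dot separators are not handled.
    """
    # A digit followed by any mix of digits and whitespace (including NBSP).
    match = re.search(r"\d[\d\s]*", text)
    if match is None:
        return None
    # Drop the separators and keep only the digits.
    return int(re.sub(r"\D", "", match.group()))


# Hypothetical display strings for illustration only.
print(parse_price("8 799,-"))
print(parse_price("no price"))
```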
