Unable to parse two fields from a webpage using requests module
I'm trying to scrape two fields, product_title and item_code, from this webpage using the requests module. When I run the script below, I always get an AttributeError instead of the result, because the data I'm after is not in the page source.
However, I've come across several solutions here that fetch data from JavaScript-driven sites even when the data is not in the page source, so I suppose there should be a way to grab the two fields from this webpage using requests.
import requests
from bs4 import BeautifulSoup

link = 'https://www.sainsburys.co.uk/gol-ui/Product/persil-small---mighty-non-bio-laundry-liquid-21l-60-washes'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    res = s.get(link)
    soup = BeautifulSoup(res.text, "lxml")
    product_title = soup.select_one("h1[data-test-id='pd-product-title']").get_text(strip=True)
    item_code = soup.select_one("span#productSKU").get_text(strip=True)
    print(product_title, item_code)
Expected output:
Persil Non-Bio Laundry Liquid 1.43L
Item code: 7637944
How can I fetch the two fields from that site using requests?
The website actually loads the product data by calling an API, so you can query that API directly to get the data:
import requests

r = requests.get('https://www.sainsburys.co.uk/groceries-api/gol-services/product/v1/product?filter[product_seo_url]=gb%2Fgroceries%2Fpersil-small---mighty-non-bio-laundry-liquid-21l-60-washes&include[ASSOCIATIONS]=true&include[PRODUCT_AD]=citrus')
products = r.json()['products']
for each_product in products:
    print(f"Item code: {each_product['product_uid']}")
    print(each_product['name'])

# Item code: 7637944
# Persil Non-Bio Laundry Liquid 1.43L
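
If you want to reuse this for other product pages, one possible approach is to derive the API query from the page URL rather than hard-coding it. The sketch below is only a rough illustration: it assumes the product_seo_url filter is always "gb/groceries/" followed by the last path segment of the page URL (that is the pattern in the URL above, but it is not confirmed for other products), and the helper names build_api_params, extract_fields and fetch_fields are made up for this example. Note that requests will percent-encode the square brackets in the parameter names, which the server is assumed to accept.

```python
from urllib.parse import urlsplit

# Endpoint taken from the request shown above.
API_URL = "https://www.sainsburys.co.uk/groceries-api/gol-services/product/v1/product"


def build_api_params(product_page_url):
    """Derive the API query parameters from a product page URL.

    Assumption: the SEO URL is 'gb/groceries/' plus the last path segment.
    """
    slug = urlsplit(product_page_url).path.rstrip("/").rsplit("/", 1)[-1]
    return {
        "filter[product_seo_url]": f"gb/groceries/{slug}",
        "include[ASSOCIATIONS]": "true",
        "include[PRODUCT_AD]": "citrus",
    }


def extract_fields(payload):
    """Return (name, product_uid) pairs, skipping entries that lack either key."""
    return [
        (product["name"], product["product_uid"])
        for product in payload.get("products", [])
        if "name" in product and "product_uid" in product
    ]


def fetch_fields(product_page_url):
    """Call the API for one product page and return its (name, product_uid) pairs."""
    import requests  # imported here so the helpers above work without it

    response = requests.get(
        API_URL, params=build_api_params(product_page_url), timeout=10
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return extract_fields(response.json())
```

Calling fetch_fields(link) with the link from the question should then give a list of (name, product_uid) tuples, provided the endpoint still responds the same way. Using .get() and key checks in extract_fields avoids a KeyError if the response shape differs from what is shown above.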