
How to get values and item name from website using Beautiful Soup

I have a basic bs4 question here, but I have been trying for hours!

import urllib.request
from bs4 import BeautifulSoup

url = 'https://www.currys.co.uk/gbuk/search-keywords/xx_xx_xx_xx_xx/acer/xx-criteria.html'
r = urllib.request.urlopen(url).read()
soup = BeautifulSoup(r, 'lxml')

# Every price container on the results page
price = soup.find_all("div", class_="productPrices")

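How do I now get at the price? In this case it is the <strong class="price" data-product="price"> tag.

I would also like to be able to extract the product SKU: "productSKU":"200341"

I want to iterate over all pages that match my search (in this case, just "acer") and store every SKU and price matching that search in a DataFrame.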

You can try this:

import re
from collections import namedtuple

import requests
from bs4 import BeautifulSoup as soup

product = namedtuple('product', ['name', 'price', 'sku'])

url = 'https://www.currys.co.uk/gbuk/search-keywords/xx_xx_xx_xx_xx/acer/xx-criteria.html'
page_data = requests.get(url).text
parsed = soup(page_data, 'html.parser')  # parse once and reuse the tree

# Names and prices sit in elements marked with data-product attributes.
names = [i.text for i in parsed.find_all('span', {'data-product': 'name'})]
prices = [re.sub(r'\s+', '', i.text) for i in parsed.find_all('strong', {'data-product': 'price'})]

# SKUs come from JSON embedded in the page: map each product name (minus its last character) to its SKU.
skus = {b[:-1]: a for a, b in re.findall('"productSKU":"(.*?)","productName":"(.*?)"', page_data)}

# Pair each listing with the first SKU whose JSON name occurs in the displayed name (SKU kept as a string).
final_product_data = [product(a, b, [h for c, h in skus.items() if c in a][0]) for a, b in zip(names, prices)]
print([(i.name, i.price, i.sku) for i in final_product_data])

Output:

[('KG221Q Full HD 21.5" LED Monitor - Black', '£99.99', '201795'), ('C22-760 21.5" All-in-One PC - Silver', '£399.97', '200341'), ('KG271 Full HD 27" LED Gaming Monitor - Black', '£179.99', '201797'), ('S242HLDBID Full HD 24" LED Monitor', '£119.99', '156512'), ('CB3-431 14" Full HD Chromebook - Silver', '£299.99', '169493'), ('CB3-431 14" Full HD Chromebook - Gold', '£299.99', '169493'), ('Iconia One 10 B3-A40 10.1" Tablet - 16 GB, White', '£139.99', '214589'), ('14 CB3-431 Chromebook - Silver', '£249.99', '183981'), ('ED242QRwi Full HD 24" Curved LCD Monitor - White', '£119.99', '224620'), ('Aspire E15 15.6" Laptop - Black', '£699.99', '204284'), ('11 CB3-131 Chromebook - White', '£199.97', '165016'), ('R241Ybmid Full HD 23.8" LED Monitor', '£134.99', '164002'), ('CB3-131 11.6" Chromebook - Blue', '£199.99', '214340'), ('15 CB3-532 Full HD Chromebook - Iron', '£279.99', '191983'), ('Chromebook R 13 CB5-312T 2-in-1 - Silver', '£399.99', '180082'), ('14 CB3-431 Chromebook - Gold', '£249.99', '183980'), ('Iconia One 10 B3-A40 10.1" Tablet - 32 GB, Black', '£149.99', '214589'), ('Swift 3 SF314-52 14" Laptop - Silver', '£649.99', '205493'), ('C24-760 23.8" All-in-One PC - Silver', '£599.99', '200448'), ('Chromebook R 11 CB5-132T 2-in-1 - White', '£279.99', '183985')]

Your data is now stored as a list of namedtuple objects, so each field can be accessed by name.
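
The answer above handles a single results page, while the question also asks for every page and a DataFrame. The following is only a rough sketch of how that could be layered on top. The helpers fetch_page and parse_page are hypothetical stand-ins (the post does not show how the site numbers its result pages, so no URL pattern is assumed here), and the price format is assumed to look like '£99.99'. The demonstration uses two records copied from the output above instead of live requests.

```python
from collections import namedtuple

# Same record shape as in the answer above.
product = namedtuple('product', ['name', 'price', 'sku'])


def to_row(item):
    """Turn one product record into a plain dict with a numeric price."""
    row = item._asdict()
    # Assumption: prices look like '£99.99' or '£1,299.00'.
    row['price'] = float(row['price'].lstrip('£').replace(',', ''))
    return row


def collect_rows(fetch_page, parse_page, max_pages=50):
    """Gather rows page by page until a page yields no products.

    fetch_page(n)    -> HTML text of results page n  (hypothetical helper)
    parse_page(html) -> list of product records      (hypothetical helper)
    """
    rows = []
    for page_number in range(1, max_pages + 1):
        items = parse_page(fetch_page(page_number))
        if not items:  # an empty page is taken to mean the results ran out
            break
        rows.extend(to_row(item) for item in items)
    return rows


# Offline demonstration: stand-in "pages" instead of real HTTP requests.
fake_pages = {
    'page-1': [product('KG221Q Full HD 21.5" LED Monitor - Black', '£99.99', '201795')],
    'page-2': [product('Aspire E15 15.6" Laptop - Black', '£699.99', '204284')],
}
rows = collect_rows(
    fetch_page=lambda n: 'page-{}'.format(n),
    parse_page=lambda html: fake_pages.get(html, []),
)
print(rows)
```

Since rows is a list of dicts, passing it to pandas.DataFrame(rows) should give one column per field. In real use, fetch_page would wrap requests.get and parse_page would wrap the parsing code from the answer; the max_pages cap is there so the loop cannot run indefinitely if the site never returns an empty page.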
