
Scraping data using Beautiful Soup

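I'm fairly new to web scraping and Python, and I have a piece of code I need some help with. I can't for the life of me see where I've gone wrong. As always, any help is much appreciated. If you have time, please point out where I went wrong!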

import requests
from bs4 import BeautifulSoup
import csv

page = requests.get("https://www.bathroomspareparts.co.uk/merlyn-two-panel-hinged-bath-screen-mb7-spare-parts-17909-c.asp")
soup = BeautifulSoup(page.content, 'html.parser')


all_products = []


products = soup.select('div.row cf')
for product in products:
    name = product.select('div.item-short-description')[0].text.strip() 
    price = product.select('div.item-price')[0].text.strip() 

    all_products.append({
        "Name": name,
        "Price": price,
    })

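The selector 'div.row cf' looks for a <cf> tag nested inside div.row, not a div carrying both classes (that would be 'div.row.cf'), so it most likely matched nothing and the loop never ran. The code below selects each product with '#products section' instead. Also, some products don't have a price, so you have to check for that: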

import csv
import requests
from bs4 import BeautifulSoup

page = requests.get(
    "https://www.bathroomspareparts.co.uk/merlyn-two-panel-hinged-bath-screen-mb7-spare-parts-17909-c.asp"
)
soup = BeautifulSoup(page.content, "html.parser")

all_products = []

products = soup.select("#products section")
for product in products:
    name = product.select_one(".item-short-description").text.strip()

    price = product.select_one(".item-price")
    price = price.text.strip() if price else "N/A"

    all_products.append(
        {
            "Name": name,
            "Price": price,
        }
    )

print(all_products)

Prints:

[
    {"Name": "Merlyn Bath Screen Cover Caps M7012", "Price": "£1.24"},
    {"Name": "Merlyn Luna Rail End Cap LH SP0M7010/L", "Price": "£1.81"},
    {"Name": "Merlyn Luna Rail End Cap RH SP0M7010/R", "Price": "£1.81"},
    {"Name": "Merlyn Luna Rail SP0M7005", "Price": "£6.54"},
    {
        "Name": "Merlyn Two Panel Hinged Bath Screen MB7 Spare Parts",
        "Price": "N/A",
    },
]
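
Both scripts import csv but never use it, so the next step was presumably to save the results. A minimal sketch of that step, assuming each row is a dict with the "Name" and "Price" keys shown above; the helper name products_to_csv and the sample rows (copied from the output above) are only for illustration and are not part of the original answer:

```python
import csv
import io

# Sample rows copied from the output above; in the real script this would be
# the all_products list built by the scraping loop.
all_products = [
    {"Name": "Merlyn Bath Screen Cover Caps M7012", "Price": "£1.24"},
    {"Name": "Merlyn Two Panel Hinged Bath Screen MB7 Spare Parts", "Price": "N/A"},
]


def products_to_csv(products, out):
    """Write the product dicts to the open text stream `out` as CSV rows."""
    writer = csv.DictWriter(out, fieldnames=["Name", "Price"])
    writer.writeheader()
    writer.writerows(products)


# Demo: write to an in-memory buffer so the sketch runs without touching disk.
buffer = io.StringIO()
products_to_csv(all_products, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, open a file with open("products.csv", "w", newline="", encoding="utf-8") and pass it as out. The file name is a placeholder; newline="" is what the csv module documentation recommends so that rows are not separated by extra blank lines on Windows.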
