
Python BeautifulSoup writing 1 line in CSV

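I'm trying to get the product name, link, and price for every product shown on the page, one product per row, with the values separated by commas.

I wrote this code, which works on a similar site, but for some reason it only writes the first result to the CSV.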

import requests
from bs4 import BeautifulSoup
from csv import writer

response = requests.get('https://www.micoca-cola.cl/bebidas/coca-cola')
soup = BeautifulSoup(response.text, 'html.parser')

items = soup.find_all(class_='prateleira vitrine n12colunas')

with open('coca.csv', 'w', newline='') as csv_file:
    csv_writer = writer(csv_file)
    headers = ['Producto', 'Link', 'Precio']
    csv_writer.writerow(headers)

    for item in items:
        producto = item.find(class_='product-block-name').get_text()
        link = item.find('a')['href']
        price = item.find(class_='bestPrice').get_text().replace('\n', '').replace('"', '').replace(' ', '')
        csv_writer.writerow([producto, link, price])

This gives the following result:

Producto,Link,Precio
"Refill 8 Coca-Cola Sin Azúcar reornable 2,0 lt. (No incluye envases)",https://www.micoca-cola.cl/refill-8-coca-cola-sin-azucar-retornable-20-lt-no-incluye-envases/p,"$9.520,00"

But there are other products on that page, which I would like to include on their own rows.

What's missing?
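One likely cause, assuming the class `prateleira vitrine n12colunas` belongs to the single container that wraps the whole product list: `find_all` then returns one element, the loop runs once, and each `find` inside it returns only the first match. The sketch below reproduces that on hypothetical markup (the HTML, product names, and links are invented for illustration, not taken from the site) and compares it with iterating over the per-product elements instead.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup, loosely modeled on the class names in the question.
SAMPLE_HTML = """
<div class="prateleira vitrine n12colunas">
  <ul>
    <li class="product-group">
      <h4 class="product-block-name"><a href="/a/p">Product A</a></h4>
      <span class="bestPrice">$ 1.000,00</span>
    </li>
    <li class="product-group">
      <h4 class="product-block-name"><a href="/b/p">Product B</a></h4>
      <span class="bestPrice">$ 2.000,00</span>
    </li>
  </ul>
</div>
"""


def rows_for(items):
    """Build one [name, link, price] row per element in items."""
    rows = []
    for item in items:
        name = item.find(class_='product-block-name').get_text(strip=True)
        link = item.find('a')['href']
        price = item.find(class_='bestPrice').get_text(strip=True)
        rows.append([name, link, price])
    return rows


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')

# Matches the outer container only, so the loop body runs a single time.
containers = soup.find_all(class_='prateleira vitrine n12colunas')
container_rows = rows_for(containers)

# Matches each product, so every product gets its own row.
products = soup.find_all(class_='product-group')
product_rows = rows_for(products)

# Write the per-product rows with the same csv.writer approach as the question.
buffer = io.StringIO()
csv_writer = csv.writer(buffer)
csv_writer.writerow(['Producto', 'Link', 'Precio'])
csv_writer.writerows(product_rows)
csv_text = buffer.getvalue()
```

If the real page is structured this way, changing the selector in `find_all` to the per-product class would be enough to get one row per product from the first page of results.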

To load all product titles, links, and prices and save them as CSV, you can use this example:

import re
import requests
import pandas as pd
from bs4 import BeautifulSoup


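url = 'https://www.micoca-cola.cl/bebidas/coca-cola'
html_doc = requests.get(url).text
# The page's scripts fetch the product list via a .load('...') call; extract that URL.
page_url = 'https://www.micoca-cola.cl' + re.search(r"\.load\('(.*?)'", html_doc).group(1)

data = []
page = 1
while True:
    # Request the result pages one at a time by appending the page number.
    soup = BeautifulSoup(requests.get(page_url + str(page)).content, 'html.parser')

    # A response without a <body> is taken to mean there are no more pages.
    if not soup.body:
        break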

    for product in soup.select('.product-group'):
        title = product.h4.text
        link = product.h4.a['href']
        print(title)
        print(link)
        # Products without a price element are reported as out of stock.
        price = product.find(class_="bestPrice")
        price = price.get_text(strip=True) if price else 'Out of Stock'
        print(price)
        print('-' * 80)

        data.append({
            'title': title,
            'link': link,
            'price': price
        })

    page += 1

df = pd.DataFrame(data)
print(df)
df.to_csv('data.csv', index=False)

Prints:

...
32                        Coca-Cola Light 6 x 591 ml.  ...    $ 5.340,00
33                       Coca-Cola Sin Azúcar 1,5 lt.  ...    $ 1.390,00
34                       Coca-Cola Sin Azúcar 2,5 lt.  ...    $ 1.890,00
35  Starter Kit Coca-Cola Light retornable 9 x 1,2...  ...   $ 10.710,00
36  Starter Kit Coca-Cola Original retornable 8 x ...  ...   $ 11.920,00
37                     Coca-Cola Original 6 x 3,0 lt.  ...   $ 13.140,00
38                Coca-Cola Energy Sin Azúcar 220 ml.  ...      $ 990,00
39  Starter Kit Coca-Cola Sin Azúcar retornable re...  ...    $ 1.490,00
40  Starter Kit Coca-Cola Sin Azúcar retornable 1,...  ...    $ 1.190,00
41                            Coca-Cola Light 2,5 lt.  ...    $ 1.890,00
42                            Coca-Cola Light 1,5 lt.  ...    $ 1.390,00
43                   Coca-Cola Sin Azúcar 6 x 250 ml.  ...    $ 2.290,00
44                         Coca-Cola Original 1,5 lt.  ...    $ 1.390,00
45                         Coca-Cola Original 3,0 lt.  ...    $ 2.190,00
46                     Coca-Cola Original 6 x 591 ml.  ...    $ 5.340,00
47  Starter Kit Coca-Cola Original retornable 9 x ...  ...   $ 10.710,00
48  Starter Kit Coca-Cola Light retornable 8 x 2,0...  ...   $ 11.920,00
49  Starter Kit Coca-Cola Original retornable 2,0 ...  ...    $ 1.490,00
50  Starter Kit Coca-Cola Light retornable retorna...  ...    $ 1.190,00
51                            Coca-Cola Light 3,0 lt.  ...    $ 2.190,00
52                     Coca-Cola Original 6 x 250 ml.  ...    $ 2.290,00
53  Starter Kit Coca-Cola Light retornable retorna...  ...    $ 1.490,00
54  Starter Kit Coca-Cola Original retornable 1,25...  ...    $ 1.190,00
55                         Coca-Cola Original 2,5 lt.  ...    $ 1.890,00
56                         Coca-Cola Original 1,0 lt.  ...      $ 990,00
57                            Coca-Cola Light 1,0 lt.  ...  Out of Stock

[58 rows x 3 columns]
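The step that makes the answer's script find every product is the regular expression that pulls the paginated URL out of a `.load('...')` call in the page source. Here is a minimal sketch of that step in isolation, run against a hypothetical script snippet; the path `/buscapagina?PS=12&PageNumber=` and the helper names are assumptions for illustration, not values read from the site.

```python
import re

# Hypothetical page source; the real page's script and path may differ.
SAMPLE_PAGE = """
<script>
  $("#results").load('/buscapagina?PS=12&PageNumber=' + pageNumber);
</script>
"""

BASE = 'https://www.micoca-cola.cl'


def find_load_path(html_text):
    """Return the first string literal passed to .load('...'), or None."""
    match = re.search(r"\.load\('(.*?)'", html_text)
    return match.group(1) if match else None


def page_urls(base, path, pages):
    """Build the URLs for pages 1..pages by appending the page number."""
    return [base + path + str(number) for number in range(1, pages + 1)]


path = find_load_path(SAMPLE_PAGE)
urls = page_urls(BASE, path, 3)
missing = find_load_path('<html>no load call here</html>')
```

Returning `None` when nothing matches avoids the `AttributeError` that `re.search(...).group(1)` raises if the site changes its markup.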

The script also saves the results to data.csv; the original answer showed a screenshot of the file opened in LibreOffice.
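If you would rather stay with the standard `csv` module used in the question than add pandas, the same list of dictionaries can be written with `csv.DictWriter`. This is a sketch under that assumption; the rows below are placeholder examples, not scraped data, and `rows_to_csv` is a hypothetical helper.

```python
import csv
import io

# Placeholder rows shaped like the dictionaries collected in `data` above.
data = [
    {'title': 'Example A 1,5 lt.', 'link': 'https://example.com/a/p', 'price': '$ 1.390,00'},
    {'title': 'Example B 1,0 lt.', 'link': 'https://example.com/b/p', 'price': 'Out of Stock'},
]


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text, header row first."""
    buffer = io.StringIO(newline='')
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(data, ['title', 'link', 'price'])

# Reading the text back shows that fields containing commas survive intact.
round_trip = list(csv.DictReader(io.StringIO(csv_text)))
```

To write to disk instead, open the file with `open('data.csv', 'w', newline='')` and pass it to `csv.DictWriter` in place of the `StringIO` buffer.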

