Python BeautifulSoup writing 1 line in CSV
I'm trying to get all the values of the product name, link and price shown on the page, each taking up a row and separated by commas.
I've written this code, which works on a similar site, but for some reason here it only writes the first result to the CSV.
import requests
from bs4 import BeautifulSoup
from csv import writer

response = requests.get('https://www.micoca-cola.cl/bebidas/coca-cola')
soup = BeautifulSoup(response.text, 'html.parser')
items = soup.find_all(class_='prateleira vitrine n12colunas')

with open('coca.csv', 'w', newline='') as csv_file:
    csv_writer = writer(csv_file)
    headers = ['Producto', 'Link', 'Precio']
    csv_writer.writerow(headers)
    for item in items:
        producto = item.find(class_='product-block-name').get_text()
        link = item.find('a')['href']
        price = item.find(class_='bestPrice').get_text().replace('\n', '').replace('"', '').replace(' ', '')
        csv_writer.writerow([producto, link, price])
This gives the following result:
Producto,Link,Precio
"Refill 8 Coca-Cola Sin Azúcar retornable 2,0 lt. (No incluye envases)",https://www.micoca-cola.cl/refill-8-coca-cola-sin-azucar-retornable-20-lt-no-incluye-envases/p,"$9.520,00"
But there are other products on that page that I want to include, each on its own line.
What's missing?
To load all product titles, links and prices and save them to CSV, you can use this example:
import re
import requests
import pandas as pd
from bs4 import BeautifulSoup

url = 'https://www.micoca-cola.cl/bebidas/coca-cola'
html_doc = requests.get(url).text

# The page source contains a .load('...') call; its argument is the path
# of the paginated product listing.
page_url = 'https://www.micoca-cola.cl' + re.search(r"\.load\('(.*?)'", html_doc).group(1)

data = []
page = 1
while True:
    # Request the next page of the listing by appending the page number.
    soup = BeautifulSoup(requests.get(page_url + str(page)).content, 'html.parser')
    # A response without a <body> element is treated as the end of the listing.
    if not soup.body:
        break
    # Each product sits in its own .product-group element.
    for product in soup.select('.product-group'):
        title = product.h4.text
        link = product.h4.a['href']
        print(title)
        print(link)
        # Some products have no price element; mark those as out of stock.
        price = product.find(class_="bestPrice")
        price = price.get_text(strip=True) if price else 'Out of Stock'
        print(price)
        print('-' * 80)
        data.append({
            'title': title,
            'link': link,
            'price': price
        })
    page += 1

df = pd.DataFrame(data)
print(df)
df.to_csv('data.csv', index=False)
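The URL-discovery step can be tried offline. The following is a minimal sketch, assuming the page source embeds the listing path in a `.load('...')` call, as the regex in the script implies. The `sample_html` string, the path inside it and the `find_listing_path` helper are hypothetical and only illustrate how the pattern behaves; the real markup may differ.

```python
import re

# Hypothetical page source: the real site's script and path may differ.
sample_html = """
<script>
    $('#products').load('/hypothetical/listing?PageNumber=');
</script>
"""


def find_listing_path(html):
    """Return the first argument passed to .load('...'), or None if absent."""
    match = re.search(r"\.load\('(.*?)'", html)
    return match.group(1) if match else None


path = find_listing_path(sample_html)
base = 'https://www.micoca-cola.cl'

# Build the URLs for the first three pages by appending the page number,
# the same way the loop builds page_url + str(page).
page_urls = [base + path + str(n) for n in range(1, 4)]
print(page_urls)
```

Checking for `None` before calling `.group(1)` matters because `re.search` returns `None` when the pattern is not found, in which case the unguarded call raises `AttributeError`.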
The full scraper prints (final rows shown):
...
32 Coca-Cola Light 6 x 591 ml. ... $ 5.340,00
33 Coca-Cola Sin Azúcar 1,5 lt. ... $ 1.390,00
34 Coca-Cola Sin Azúcar 2,5 lt. ... $ 1.890,00
35 Starter Kit Coca-Cola Light retornable 9 x 1,2... ... $ 10.710,00
36 Starter Kit Coca-Cola Original retornable 8 x ... ... $ 11.920,00
37 Coca-Cola Original 6 x 3,0 lt. ... $ 13.140,00
38 Coca-Cola Energy Sin Azúcar 220 ml. ... $ 990,00
39 Starter Kit Coca-Cola Sin Azúcar retornable re... ... $ 1.490,00
40 Starter Kit Coca-Cola Sin Azúcar retornable 1,... ... $ 1.190,00
41 Coca-Cola Light 2,5 lt. ... $ 1.890,00
42 Coca-Cola Light 1,5 lt. ... $ 1.390,00
43 Coca-Cola Sin Azúcar 6 x 250 ml. ... $ 2.290,00
44 Coca-Cola Original 1,5 lt. ... $ 1.390,00
45 Coca-Cola Original 3,0 lt. ... $ 2.190,00
46 Coca-Cola Original 6 x 591 ml. ... $ 5.340,00
47 Starter Kit Coca-Cola Original retornable 9 x ... ... $ 10.710,00
48 Starter Kit Coca-Cola Light retornable 8 x 2,0... ... $ 11.920,00
49 Starter Kit Coca-Cola Original retornable 2,0 ... ... $ 1.490,00
50 Starter Kit Coca-Cola Light retornable retorna... ... $ 1.190,00
51 Coca-Cola Light 3,0 lt. ... $ 2.190,00
52 Coca-Cola Original 6 x 250 ml. ... $ 2.290,00
53 Starter Kit Coca-Cola Light retornable retorna... ... $ 1.490,00
54 Starter Kit Coca-Cola Original retornable 1,25... ... $ 1.190,00
55 Coca-Cola Original 2,5 lt. ... $ 1.890,00
56 Coca-Cola Original 1,0 lt. ... $ 990,00
57 Coca-Cola Light 1,0 lt. ... Out of Stock
[58 rows x 3 columns]
The script also saves the same rows to data.csv.
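As for why the code in the question writes a single row: a plausible explanation, not confirmed against the live page, is that `prateleira vitrine n12colunas` is the class of one wrapper element around the whole product list. In that case `find_all` returns a single element, the loop runs once, and each `item.find(...)` returns only the first match inside the wrapper. The sketch below reproduces this on hypothetical markup and writes the rows with `csv.writer` instead of pandas; the `html` string and the `to_row` helper are made up for illustration.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup: one wrapper element containing two products.
html = """
<div class="prateleira vitrine n12colunas">
  <div class="product-group">
    <h4 class="product-block-name"><a href="/a/p">Product A</a></h4>
    <span class="bestPrice">$ 1.000,00</span>
  </div>
  <div class="product-group">
    <h4 class="product-block-name"><a href="/b/p">Product B</a></h4>
  </div>
</div>
"""
soup = BeautifulSoup(html, 'html.parser')

# Matching the wrapper's full class string yields one element ...
containers = soup.find_all(class_='prateleira vitrine n12colunas')
# ... while selecting the per-product class yields one element per product.
products = soup.select('.product-group')
print(len(containers), len(products))


def to_row(product):
    """Build one CSV row; products without a price are marked out of stock."""
    price = product.find(class_='bestPrice')
    return [
        product.h4.get_text(strip=True),
        product.h4.a['href'],
        price.get_text(strip=True) if price else 'Out of Stock',
    ]


# Write to an in-memory buffer so the sketch leaves no file behind.
buffer = io.StringIO()
csv_writer = csv.writer(buffer)
csv_writer.writerow(['Producto', 'Link', 'Precio'])
csv_writer.writerows(to_row(p) for p in products)
csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the buffer with `open('coca.csv', 'w', newline='')` as in the question. Iterating over the per-product elements is what produces one row per product.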