
Beautiful Soup Web scraping Python

I have this HTML on a website:

[![enter image description here][1]][1]

This is my Python script:

import csv 
import requests
from urllib.request import urlopen
from bs4 import BeautifulSoup

csv_file = open('C:\\Users\scrap_result.csv','w',newline='')


csv_writer = csv.writer(csv_file, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
csv_writer.writerow(['headline', 'price', 'img_src'])

for page in range (1,3):
    url = "https://test.vn/products?page=/{}/".format(page)
    html = urlopen(url)
    soup = BeautifulSoup(html,"lxml")
    
for productname in soup.find_all('productname'):
    headline = productname.h6.text
    
    price= productname.find('h6',class_='product-card__name').text
    img_src = productname.find('picture',class_='product-card__image mb-3 lozad').img['src']
    
    print(headline)
    print(price)
    print(img_src)
    csv_writer.writerow([headline, price, img_src])
csv_file.close()

When I run it, it returns empty values. I guess I'm not calling the right tags, but I can't figure out what's wrong.

I can't see anything called "productname" in the HTML. A plain string passed to find_all is matched against tag names, so soup.find_all('productname') looks for <productname> elements and most likely returns an empty list.
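To illustrate the difference between matching a tag name and matching a class, here is a minimal sketch. The page's real HTML is only available as a screenshot, so the markup below is hypothetical: the "product-card" container class is an assumption inferred from the class names used in the question, and "html.parser" is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real HTML is not shown in text form,
# so the "product-card" container class is an assumption.
html = """
<div class="product-card">
  <h6 class="product-card__name">Sample product</h6>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# A string argument is matched against tag names, so this searches for
# <productname> elements and finds none.
by_tag = soup.find_all("productname")

# class_ matches against the class attribute instead.
by_class = soup.find_all("div", class_="product-card")

print(len(by_tag))    # 0
print(len(by_class))  # 1
```

If the container on the real page uses a different tag or class, only the arguments to find_all need to change.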

By the way: the for loops aren't nested, so the second loop starts only after the first one has finished. By then soup holds only the last page, so only the last URL is searched.
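One possible way to restructure the script so that every page is parsed is sketched below. It is not a drop-in fix: the class names "product-card" and "product-card__price", and the helper names parse_products, scrape and fetch, are all assumptions made for illustration. The download step is passed in as a function so the loop structure can be tried without network access.

```python
import csv
import io
from bs4 import BeautifulSoup


def parse_products(html):
    """Return one [headline, price, img_src] row per product card."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # "product-card" and "product-card__price" are assumed class names.
    for card in soup.find_all("div", class_="product-card"):
        name = card.find("h6", class_="product-card__name")
        price = card.find(class_="product-card__price")
        img = card.find("img")
        rows.append([
            name.get_text(strip=True) if name else "",
            price.get_text(strip=True) if price else "",
            img.get("src", "") if img else "",
        ])
    return rows


def scrape(pages, fetch, out):
    """Write a header plus the rows of every page to the file-like `out`."""
    writer = csv.writer(out)
    writer.writerow(["headline", "price", "img_src"])
    for page in pages:
        html = fetch(page)
        # Parsing happens inside the page loop, so each page is processed
        # before the next one replaces it.
        for row in parse_products(html):
            writer.writerow(row)


def fake_fetch(page):
    """Stand-in for the real download; returns hypothetical markup."""
    return (
        '<div class="product-card">'
        f'<h6 class="product-card__name">Item {page}</h6>'
        f'<span class="product-card__price">{page}0</span>'
        f'<picture><img src="/img/{page}.jpg"></picture>'
        "</div>"
    )


buffer = io.StringIO()
scrape(range(1, 3), fake_fetch, buffer)
print(buffer.getvalue())
```

In the real script, fetch could wrap the urlopen call from the question and out could be the opened CSV file; the selectors would need to be adjusted to whatever the page actually contains.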


 