Getting an empty list when trying to extract data from a website with BeautifulSoup (web scraping)

Question

I am trying to extract the profile names from the reviews at this link: https://www.amazon.in/Samsung-Midnight-Storage-6000mAh-Battery/dp/B0B4F52B5X/?_encoding=UTF8&pd_rd_w=4JKBg&content-id=amzn1.sym.e0e8ce89-ede3-4c51-b6ad-44989efc8536&pf_rd_p=e0e8ce89-ede3-4c51-b6ad-44989efc8536&pf_rd_r=NEBBF38XJRRBGK0BZBX3&pd_rd_wg=qFxtB&pd_rd_r=0f156162-4690-4ef5-9a8b-8b03e82e194b&ref_=pd_gw_ci_mcx_mr_hp_d&th=1

The names are inside span elements with class_="a-profile-name".

But when I try to print the result, I just get an empty list.

Here is my code:

    
from bs4 import BeautifulSoup as bs
import requests

link = 'https://www.amazon.in/Adidas-Unisex-Sogold-cblack-Football/dp/B096NC52HY/ref=sr_1_3_sspa?crid=1HCHWT6Y1WFYU&keywords=football%2Bshoes&qid=1660709102&sprefix=foot%2Caps%2C246&sr=8-3-spons&th=1&psc=1'

soup = bs(requests.get(link).text, "html.parser")

name = soup.find_all("span", class_="a-profile-name")

print(name)

Answer 1

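It is always a good idea to send some headers with your request, for example a user-agent. Without one, requests identifies itself as python-requests/<version>, and Amazon may answer with a page that does not contain the review markup, in which case find_all has nothing to match and returns an empty list: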

requests.get(link, headers={'User-Agent': 'Mozilla/5.0'})

Note: Amazon really does not like being scraped, so sooner or later they will detect your activity and may block you.
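Before changing the parsing code, it can help to check whether the server sent the review markup at all. The helper below is only a sketch: `diagnose_response` and its return labels are hypothetical names made up for this illustration, not part of requests or BeautifulSoup, and the assumption that a missing class name means a block or placeholder page is a guess that may not hold for every response.

```python
def diagnose_response(status_code, html, marker="a-profile-name"):
    """Roughly classify a fetched page (hypothetical helper, not a library API).

    status_code: HTTP status of the response
    html:        body of the response as text
    marker:      substring expected in the page, here the review name class
    """
    # Anything other than 200 means the page itself was not served normally.
    if status_code != 200:
        return "http-error"
    # A 200 response without the marker suggests a different page came back
    # (assumption: for example a block or robot-check page).
    if marker not in html:
        return "marker-missing"
    return "ok"


print(diagnose_response(503, ""))
print(diagnose_response(200, "<html><body>placeholder</body></html>"))
print(diagnose_response(200, '<span class="a-profile-name">Amazon Customer</span>'))
```

With a real request this would be called as `diagnose_response(response.status_code, response.text)`, where `response` is the object returned by `requests.get`. A result other than "ok" points at the request rather than at the `find_all` call.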

Example

from bs4 import BeautifulSoup as bs
import requests

link = 'https://www.amazon.in/Adidas-Unisex-Sogold-cblack-Football/dp/B096NC52HY/ref=sr_1_3_sspa?crid=1HCHWT6Y1WFYU&keywords=football%2Bshoes&qid=1660709102&sprefix=foot%2Caps%2C246&sr=8-3-spons&th=1&psc=1'

# Send a browser-like User-Agent instead of the default python-requests one
response = requests.get(link, headers={'User-Agent': 'Mozilla/5.0'})
soup = bs(response.text, "html.parser")

# Collect every <span class="a-profile-name"> element on the page
name = soup.find_all("span", class_="a-profile-name")
print(name)

Output

[<span class="a-profile-name">Amazon Customer</span>, <span class="a-profile-name">Shubam Kadam</span>, <span class="a-profile-name">Aditi Sharma</span>, <span class="a-profile-name">Moris lopez</span>, <span class="a-profile-name">tana tubin</span>]
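The list above holds Tag objects rather than plain strings. If only the names are needed, the text of each tag can be pulled out with `get_text`. The sketch below parses a short HTML string copied from the output above instead of fetching the live page, so the `html` variable is a stand-in for `response.text`.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the printed output, standing in for the downloaded page.
html = (
    '<span class="a-profile-name">Amazon Customer</span>'
    '<span class="a-profile-name">Shubam Kadam</span>'
    '<span class="a-profile-name">Aditi Sharma</span>'
)

soup = BeautifulSoup(html, "html.parser")

# find_all returns Tag objects; get_text(strip=True) returns the inner text
# with surrounding whitespace removed.
names = [
    tag.get_text(strip=True)
    for tag in soup.find_all("span", class_="a-profile-name")
]
print(names)  # ['Amazon Customer', 'Shubam Kadam', 'Aditi Sharma']
```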

Answer 1
0 2022-08-17 13:31:35
