I am having some trouble with web scraping using BeautifulSoup.
Whenever I try to extract the text between tags using .text(), I get a blank screen with just [] as output.
import requests
from bs4 import BeautifulSoup
page = requests.get("https://www.amazon.in/s?k=ssd&ref=nb_sb_noss")
soup = BeautifulSoup(page.content, "html.parser")
product = soup.find_all("h2",class_="a-link-normal a-text-normal")
results = soup.find_all("span",class_="a-offscreen")
print(product)
This is the output that I got:
C:\Users\Kushal\Desktop\requests-tutorial>C:/Users/Kushal/AppData/Local/Programs/Python/Python37/python.exe c:/Users/Kushal/Desktop/requests-tutorial/scraper.py
[]
When I try listing everything with a for loop, nothing shows up, not even the empty square brackets.
Based on your comment below, I've modified the code to fetch all the product titles on that page along with their prices.
Mark as answer if it works; otherwise, comment for further analysis.
import requests
from bs4 import BeautifulSoup
# The 'lxml' parser used below requires the lxml package to be installed,
# but it does not need to be imported here.
headers = {
"user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5)",
"accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"accept-charset": "cp1254,ISO-8859-9,utf-8;q=0.7,*;q=0.3",
"accept-encoding": "gzip,deflate,sdch",
"accept-language": "tr,tr-TR,en-US,en;q=0.8",
}
# Send the request with browser-like headers and parse the returned HTML.
response = requests.get('https://www.amazon.in/s?k=ssd&ref=nb_sb_noss', headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
titles = soup.find_all('span', attrs={'class': 'a-size-medium a-color-base a-text-normal'})
prices = soup.find_all('span', attrs={'class': 'a-offscreen'})
# Pair each title with the price at the same position and print both.
for title, price in zip(titles, prices):
    title_proper = title.text.strip()
    price_proper = price.text.strip()
    print(title_proper, '-', price_proper)
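One thing to watch with the approach above: the titles and the prices are collected as two independent lists, so if any product on the page has no price, every pair after it can shift by one. A possible way around this is to loop over each product container and look up the title and price inside it. The sketch below runs on a small inline HTML sample rather than the live page. The "result" container class, the SAMPLE_HTML contents, and the extract_products helper are made up for illustration; the real page markup is different and changes often, so the container selector would need to be adapted after inspecting the actual HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a search results page (NOT real Amazon markup).
# The second product deliberately has no price.
SAMPLE_HTML = """
<div class="result">
  <span class="a-size-medium a-color-base a-text-normal">SSD A 500GB</span>
  <span class="a-offscreen">Rs. 3,999</span>
</div>
<div class="result">
  <span class="a-size-medium a-color-base a-text-normal">SSD B 1TB</span>
</div>
<div class="result">
  <span class="a-size-medium a-color-base a-text-normal">SSD C 2TB</span>
  <span class="a-offscreen">Rs. 12,499</span>
</div>
"""


def extract_products(html):
    """Return a list of (title, price) tuples; price is None when missing."""
    soup = BeautifulSoup(html, "html.parser")
    products = []
    # "result" is an assumed container class used only in this sample.
    for card in soup.find_all("div", class_="result"):
        title = card.find("span", class_="a-size-medium a-color-base a-text-normal")
        price = card.find("span", class_="a-offscreen")
        if title is None:
            continue  # not a product card, skip it
        products.append((
            title.get_text(strip=True),
            price.get_text(strip=True) if price is not None else None,
        ))
    return products


for title, price in extract_products(SAMPLE_HTML):
    print(title, "-", price if price is not None else "price not listed")
```

With this sample, zipping two page-wide lists would pair "SSD B 1TB" with the price of "SSD C 2TB", while the per-container lookup keeps each price next to its own title.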