繁体   English   中英

soup.findAll返回空列表

[英]soup.findAll returning empty list

当我打电话给findAll时,我正试着用汤刮,并获得一个空集

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url='https://www.sainsburys.co.uk/webapp/wcs/stores/servlet/SearchDisplayView?catalogId=10123&langId=44&storeId=10151&krypto=70KutR16JmLgr7Ka%2F385RFXrzDpOkSqx%2FRC3DnlU09%2BYcw0pR5cfIfC0kOlQywiD%2BTEe7ppq8ENXglbpqA8sDUtif1h3ZjrEoQkV29%2B90iqljHi2gm2T%2BDZHH2%2FCNeKB%2BkVglbz%2BNx1bKsSfE5L6SVtckHxg%2FM%2F%2FVieWp8vgaJTan0k1WrPjCrVuDs5WnbRN#langId=44&storeId=10151&catalogId=10123&categoryId=&parent_category_rn=&top_category=&pageSize=60&orderBy=RELEVANCE&searchTerm=milk&beginIndex=0&hideFilters=true&categoryFacetId1='

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html,'html.parser')

containers = page_soup.findAll("div",{"class":"product"}) 
containers

我还从这些文章中获得了空数据集: findAll为html返回空

BeautifulSoup find_all()不返回任何数据

有人可以提供任何帮助吗?

页面内容加载了javascript,因此您不能只使用BeautifulSoup来解析它。 你必须使用像selenium这样的另一个模块来模拟javacript执行。

这是一个例子:

from bs4 import BeautifulSoup as soup
from selenium import webdriver

url='https://www.sainsburys.co.uk/webapp/wcs/stores/servlet/SearchDisplayView?catalogId=10123&langId=44&storeId=10151&krypto=70KutR16JmLgr7Ka%2F385RFXrzDpOkSqx%2FRC3DnlU09%2BYcw0pR5cfIfC0kOlQywiD%2BTEe7ppq8ENXglbpqA8sDUtif1h3ZjrEoQkV29%2B90iqljHi2gm2T%2BDZHH2%2FCNeKB%2BkVglbz%2BNx1bKsSfE5L6SVtckHxg%2FM%2F%2FVieWp8vgaJTan0k1WrPjCrVuDs5WnbRN#langId=44&storeId=10151&catalogId=10123&categoryId=&parent_category_rn=&top_category=&pageSize=60&orderBy=RELEVANCE&searchTerm=milk&beginIndex=0&hideFilters=true&categoryFacetId1='

driver = webdriver.Firefox()
driver.get(url)

page = driver.page_source
page_soup = soup(page,'html.parser')

containers = page_soup.findAll("div",{"class":"product"})
print(containers)
print(len(containers))

OUTPUT:

[
<div class="product "> ...
...,
<div class="product hl-product hookLogic highlighted straplineRow" ...    
]

64

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM