[英]Beautifulsoup findAll returns an empty list
I'm trying to scrape a webpage using beautifulsoup, but findAll() returns an empty list.我正在尝试使用 beautifulsoup 抓取网页,但 findAll() 返回一个空列表。 This is my code:
这是我的代码:
URL = "https://elcinema.com/en/index/work/country/eg?page=1"
r = requests.get(URL)
bsObj = BeautifulSoup(r.content, 'html5lib')
recordList = bsObj.findAll('a', attrs = {'class':"lazy-loaded "})
print(recordList)
What am I doing wrong?我究竟做错了什么?
According to the question, it looks like you need to find all a
records who have img
tag in it with a specific class lazy-loaded
Follow the below code to get those:根据问题,您似乎需要找到所有具有
img
标签a
记录,其中包含特定类lazy-loaded
按照以下代码获取这些记录:
Code:代码:
import requests
from bs4 import BeautifulSoup
URL = "https://elcinema.com/en/index/work/country/eg?page=1"
r = requests.get(URL)
bsObj = BeautifulSoup(r.content, 'html.parser')
outputdata=[]
recordList = bsObj.findAll('a')
for record in recordList:
if record.find("img",{"class":"lazy-loaded"}):
outputdata.append(record)
print(len(outputdata))
print(outputdata)
Let me know if you have any questions:)如果您有任何问题,请告诉我:)
you need to find img
tags with a class
lazyloaded您需要找到带有
class
延迟加载的img
标签
import requests
from bs4 import BeautifulSoup
URL = "https://elcinema.com/en/index/work/country/eg?page=1"
r = requests.get(URL)
bsObj = BeautifulSoup(r.content, 'html')
recordList = bsObj.findAll('img',class_="lazy-loaded")
recordList =[i['data-src'] for i in recordList ]
print(recordList)
Output: Output:
['https://media.elcinema.com/blank_photos/75x75.jpg', 'https://media.elcinema.com/uploads/_75x75_2fe90cb32f2759181f71eb2a9b29f0735f87ac88150a6a8fd3734300f8714369.jpg', 'https://media.elcinema.com/uploads/_75x75_3d90d1ee22c5f455bc4556073eab69cd218446d6134dc0f2694782ee39ccb5bf.jpg', 'https://media.elcinema.com/uploads/_75x75_81f30061ed82645e9ee688642275d76a23ee329344c5ac25c42f22afa35432ff.jpg', 'https://media.elcinema.com/blank_photos/75x75.jpg',.......]
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.