Why do I get AttributeError: 'NoneType' object has no attribute 'attrs'?
import requests
import bs4
import csv
from itertools import zip_longest
laptop = []
laptops_price = []
links = []
url = "https://www.jumia.com.eg/ar/catalog/?q=%D9%84%D8%A7%D8%A8%D8%AA%D9%88%D8%A8"
page = requests.get("https://www.jumia.com.eg/ar/catalog/?q=%D9%84%D8%A7%D8%A8%D8%AA%D9%88%D8%A8")
bs = bs4.BeautifulSoup(page.content, 'html.parser')
laptops = bs.find_all('h3')
laptops_prices = bs.find_all("div", {"class": "prc"})
for l in range(len(laptops)):
    laptop.append(laptops[l].text)
    links.append(laptops[l].find("a", {"class" : "core"}).attrs['href'])
    laptops_price.append(laptops_prices[l].text)
laptops_list = [laptop, laptops_price, links]
exported = zip_longest(*laptops_list)
with open(r"C:\Users\Administrator\Desktop\jumiawep.csv", "w", encoding="utf-8") as jumialaptops:
    write = csv.writer(jumialaptops)
    write.writerow(["Laptop", "Price", "Links"])
    write.writerows(exported)
Traceback (most recent call last):
File "C:\Users\Administrator\PycharmProjects\pythonProject\main.py", line 17, in <module>
links.append(laptops[l].find("a").attrs['href'])
AttributeError: 'NoneType' object has no attribute 'attrs'
I was trying to get a list of links while scraping, but I get this error.
There are different things, in my opinion:
You may be blocked by Cloudflare, in which case the response contains its page rather than the shop's content: "Cloudflare is a global network designed to make everything you connect to the Internet secure, private, fast, and reliable. Secure your websites, APIs, and Internet applications. Protect corporate networks, employees, and devices. Write and deploy code that runs on the network edge."
<h3> does not have a child <a> that you try to find(); instead, <h3> is a child of <a>.
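Because find() only searches downwards, the link has to be reached by walking up from the <h3> to its enclosing <a>. A minimal sketch with BeautifulSoup's find_parent(), using a hypothetical HTML snippet that just mirrors the nesting described above:

```python
from bs4 import BeautifulSoup

# Hypothetical markup mirroring the structure described above:
# the <h3> sits inside the <a class="core">, not the other way around.
html = '<article><a class="core" href="/laptop-1"><h3>Laptop 1</h3></a></article>'
soup = BeautifulSoup(html, "html.parser")

h3 = soup.find("h3")
# find() looks down the tree, so this returns None here:
print(h3.find("a"))  # None
# Walk up instead with find_parent():
link = h3.find_parent("a", class_="core")
print(link["href"])  # /laptop-1
```

The same idea works on the real page as long as every <h3> really is nested inside such an <a>.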
Avoid the bunch of lists and process your scraping in one go.
If you are not blocked by Cloudflare and the content is not rendered dynamically by JavaScript, this should give you the expected result.
import requests, csv
from bs4 import BeautifulSoup

url = "https://www.jumia.com.eg/ar/catalog/?q=%D9%84%D8%A7%D8%A8%D8%AA%D9%88%D8%A8"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

# newline="" keeps csv from inserting blank rows on Windows
with open(r"jumiawep.csv", "w", newline="", encoding="utf-8") as jumialaptops:
    write = csv.writer(jumialaptops)
    write.writerow(["Laptop", "Price", "Links"])

    for e in soup.select('article'):
        write.writerow([
            e.h3.text,
            e.select_one('.prc').text,
            e.a.get('href')
        ])
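Note that if some article cards lack a price element (an assumption about the markup, not something verified against the live page), select_one('.prc') returns None and .text raises the same AttributeError. A defensive variant of the loop, shown here against a hypothetical two-card snippet:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second card has no price element.
html = """
<article><a class="core" href="/a"><h3>Laptop A</h3></a><div class="prc">100 EGP</div></article>
<article><a class="core" href="/b"><h3>Laptop B</h3></a></article>
"""
soup = BeautifulSoup(html, "html.parser")

rows = []
for e in soup.select("article"):
    prc = e.select_one(".prc")
    rows.append([
        e.h3.text,
        prc.text if prc else "",  # guard against a missing price node
        e.a.get("href"),
    ])
print(rows)  # [['Laptop A', '100 EGP', '/a'], ['Laptop B', '', '/b']]
```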