Python - TypeError: object of type 'NoneType' has no len() in BeautifulSoup & Selenium
I am trying to scrape product data from Flipkart.com using BeautifulSoup and Selenium, but I am getting an error. I can't understand why this error occurs, and I have already tried many of the solutions available on the internet.
Is there a fix, or should I try some other web scraping method? Please help.
The website opens through the Selenium driver, but no data is retrieved from it, and I can't understand why this happens.
Here is the code I am writing and executing:
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
#driver = webdriver.Chrome('/usr/local/bin/chromedriver')
driver = webdriver.Chrome(executable_path='chromedriver.exe')
products=[] #List to store name of the product
prices=[] #List to store price of the product
ratings=[] #List to store rating of the product
content=driver.get("https://www.flipkart.com/mobiles/pr?sid=tyy%2C4io&p%5B%5D=facets.brand%255B%255D%3DRealme&otracker=nmenu_sub_Electronics_0_Realme")
soup = BeautifulSoup(content, 'lxml')
print(soup)
for a in soup.findAll('div', attrs={'class':'bhgxx2 col-12-12'}):
    name=a.find('div', attrs={'class':'_3wU53n'})
    price=a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
    rating=a.find('div', attrs={'class':'hGSR34'})
    products.append(name.text)
    prices.append(price.text)
    ratings.append(rating.text)
    print(rating.text)
df = pd.DataFrame({'Product Name':products,'Price':prices,'Rating':ratings})
print(df)
df.to_csv('products.csv', index=False, encoding='utf-8')
Here is the error I get on the command line:
Traceback (most recent call last):
  File "C:\MachineLearning\WebScraping\web.py", line 10, in <module>
    soup = BeautifulSoup(content, 'lxml')
  File "C:\Users\karti\AppData\Local\Programs\Python\Python37-32\lib\site-packages\bs4\__init__.py", line 267, in __init__
    elif len(markup) <= 256 and (
TypeError: object of type 'NoneType' has no len()
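After loading the page with driver.get(url), you must use driver.page_source to get the page source. driver.get(url) does not return anything (its result is None), which is why BeautifulSoup fails when it calls len() on the value you passed in.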
from selenium import webdriver
driver = webdriver.Chrome(executable_path='/path/to/chromedriver')
driver.get("https://www.flipkart.com/mobiles/pr?sid=tyy%2C4io&p%5B%5D=facets.brand%255B%255D%3DRealme&otracker=nmenu_sub_Electronics_0_Realme")
print(driver.page_source)
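The message in the traceback comes from BeautifulSoup calling len() on the markup it was given, while driver.get() hands back None. Here is a minimal sketch that reproduces this without opening a browser. FakeDriver is a hypothetical stand-in written only for this illustration (it is not part of Selenium), and https://example.com is a placeholder URL.

```python
class FakeDriver:
    """Hypothetical stand-in for a Selenium WebDriver (illustration only)."""

    def __init__(self):
        self.page_source = ""

    def get(self, url):
        # Like WebDriver.get(), this "navigates" and returns nothing (None).
        self.page_source = "<html><body><div>loaded: %s</div></body></html>" % url
        return None


driver = FakeDriver()
content = driver.get("https://example.com")  # content is None, not HTML

try:
    len(content)  # roughly what BeautifulSoup does with its markup argument
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

html = driver.page_source  # the string that should go to BeautifulSoup instead

print(repr(content))
print(error_message)
print(type(html).__name__)
```

With a real driver the same pattern applies: call driver.get(url) for its side effect, then read driver.page_source.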
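Another problem with your code is that the class bhgxx2 col-12-12 is used many times on that page, and some of those elements have no product inside them. For those elements a.find(...) returns None, so accessing .text raises an AttributeError in your for loop.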
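One way to handle the missing elements is to check for None explicitly on each lookup. The helper below is only a sketch: text_or_none is a hypothetical name introduced here, and parent is assumed to be a BeautifulSoup Tag (such as a in the loop) or any object with a compatible find(name, attrs=...) method.

```python
def text_or_none(parent, tag_name, css_class):
    """Return the stripped text of the first matching child, or None.

    Assumption: `parent` behaves like a BeautifulSoup Tag, i.e. its
    find() returns a node with a .text attribute, or None if no child
    matches.
    """
    node = parent.find(tag_name, attrs={'class': css_class})
    if node is None:
        # No matching child: this is the case that would otherwise
        # raise AttributeError on `.text`.
        return None
    return node.text.strip()
```

With this, a call such as text_or_none(a, 'div', '_3wU53n') yields None for the blocks without a product, and the loop can skip them.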
A working version of your code:
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
driver = webdriver.Chrome(executable_path='/path/to/chromedriver')
products = [] # List to store name of the product
prices = [] # List to store price of the product
ratings = [] # List to store rating of the product
driver.get("https://www.flipkart.com/mobiles/pr?sid=tyy%2C4io&p%5B%5D=facets.brand%255B%255D%3DRealme&otracker=nmenu_sub_Electronics_0_Realme")
soup = BeautifulSoup(driver.page_source, 'lxml')
for a in soup.findAll('div', attrs={'class':'bhgxx2 col-12-12'}):
    try:
        name = a.find('div', attrs={'class':'_3wU53n'})
        price = a.find('div', attrs={'class':'_1vC4OE _2rQ-NK'})
        rating = a.find('div', attrs={'class':'hGSR34'})
        products.append(name.text)
        prices.append(price.text)
        ratings.append(rating.text)
    except AttributeError:
        pass
df = pd.DataFrame({'Product Name': products, 'Price': prices, 'Rating': ratings})
print(df)
df.to_csv('products.csv', index=False, encoding='utf-8')
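One caveat with the three parallel lists: if a block has a name but no price or rating, products is appended to before the AttributeError is raised, so the lists can end up with different lengths and pd.DataFrame would then raise a ValueError. A possible variation, sketched below, builds one dict per product and skips incomplete blocks entirely. FIELDS and collect_rows are hypothetical names introduced for this illustration; the class names are the ones used in the code above, and each card is assumed to behave like a BeautifulSoup Tag.

```python
# Column name -> CSS class, taken from the code above.
FIELDS = {
    'Product Name': '_3wU53n',
    'Price': '_1vC4OE _2rQ-NK',
    'Rating': 'hGSR34',
}


def collect_rows(cards):
    """Return one dict per card that contains all three fields.

    Assumption: each card has a BeautifulSoup-style find() that returns
    a node with .text, or None when nothing matches.
    """
    rows = []
    for card in cards:
        nodes = {
            column: card.find('div', attrs={'class': css_class})
            for column, css_class in FIELDS.items()
        }
        if any(node is None for node in nodes.values()):
            continue  # incomplete block: skip it as a whole
        rows.append({column: node.text for column, node in nodes.items()})
    return rows
```

The result can be passed straight to pandas, for example pd.DataFrame(collect_rows(soup.findAll('div', attrs={'class': 'bhgxx2 col-12-12'}))), since the DataFrame constructor accepts a list of dicts.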
Output:
     Price                      Product Name  Rating
0   ₹5,999  Realme C2 (Diamond Black, 16 GB)     4.4
1   ₹5,999   Realme C2 (Diamond Blue, 16 GB)     4.4
2   ₹8,999    Realme 3 (Radiant Blue, 32 GB)     4.5
3   ₹8,999   Realme 3 (Dynamic Black, 32 GB)     4.5
4   ₹9,999   Realme 3 (Dynamic Black, 64 GB)     4.5
5  ₹10,999     Realme 3 (Diamond Red, 64 GB)     4.4
...