
BeautifulSoup can't find element class in HTML

I am trying to scrape this page, which contains 10 elements with class='name main-name', as shown in the sample source.

But when I run this code:

import requests
from bs4 import BeautifulSoup

result = requests.get("https://genvita.vn/thu-thach/7-ngay-detox-da-dep-dang-thon-nguoi-khoe-qua-soc-len-den-8-trieu-dong")

c = result.text
soup = BeautifulSoup(c, "html.parser")

comment_items = soup.find_all('div', class_="name main-name")
print(len(comment_items))

it returns 0 instead of 10. I have searched and tried many solutions from Stack Overflow, but could not fix it.

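That is because the divs with class name main-name do not appear in the DOM you get back from requests. In this case, Selenium is more powerful than BeautifulSoup alone: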

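from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

driver_path = r'Your Chrome driver path'
# Selenium 4 style: the driver path is passed through a Service object
browser = webdriver.Chrome(service=Service(driver_path))
browser.get("https://genvita.vn/thu-thach/7-ngay-detox-da-dep-dang-thon-nguoi-khoe-qua-soc-len-den-8-trieu-dong")

# The browser runs the page's JavaScript, so the generated divs can be located
get_element = browser.find_elements(By.CSS_SELECTOR, "div[class='name main-name']")
print(len(get_element))

browser.close()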

Output:

10

You can also get the names like this:

for users in get_element:
    print(users.text)

Output:

Phạm Thị Kim Chi
My Linh Nguyen
Mr Vinh Bảo Hiểm Sức Khoẻ Sắc Đẹp
Ngô Thị Tuyết
Huỳnh Thị Bích Trâm
Linh Trúc Diêm
Nguyen Tu
Nguyen Thom
Hồ Thu Trang
Trầnthịtrắng
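
A quick way to confirm this diagnosis is to check whether the class attribute appears at all in the raw markup. The sketch below is only an illustration: STATIC_HTML and RENDERED_HTML are made-up stand-ins for what requests and a browser would each see (they are not copied from the real page), and count_class_attr is a hypothetical helper, not part of any library.

```python
# Hypothetical stand-in for the HTML the server sends: the comment
# container is empty and is meant to be filled by a script later.
STATIC_HTML = """
<html><body>
  <div id="comments"></div>
  <script src="/js/comments.js"></script>
</body></html>
"""

# Hypothetical stand-in for the DOM after the browser has run the script.
RENDERED_HTML = """
<html><body>
  <div id="comments">
    <div class="name main-name">User A</div>
    <div class="name main-name">User B</div>
  </div>
</body></html>
"""


def count_class_attr(markup, class_value):
    """Count literal occurrences of class="<class_value>" in raw markup."""
    return markup.count('class="%s"' % class_value)


print(count_class_attr(STATIC_HTML, "name main-name"))    # 0
print(count_class_attr(RENDERED_HTML, "name main-name"))  # 2
```

With the real page, passing result.text from requests to the same helper would show whether the elements are in the downloaded HTML. If the count is 0 there, no parser can find them, and a browser-driven tool is needed.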

As I stated in the comments, the content is dynamically generated. So here is a Selenium implementation:

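from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

url = "https://genvita.vn/thu-thach/7-ngay-detox-da-dep-dang-thon-nguoi-khoe-qua-soc-len-den-8-trieu-dong"

# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service('C:/chromedriver_win32/chromedriver.exe'))
driver.get(url)

# page_source is the DOM after the browser has run the page's JavaScript
c = driver.page_source
soup = BeautifulSoup(c, "html.parser")

comment_items = soup.find_all('div', {'class': "name main-name"})
print(len(comment_items))

driver.close()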

Output:

print (len(comment_items))
10

You can use the select method of beautifulsoup4:

import requests
from bs4 import BeautifulSoup

result = requests.get("https://genvita.vn/thu-thach/7-ngay-detox-da-dep-dang-thon-nguoi-khoe-qua-soc-len-den-8-trieu-dong")

c = result.text
soup = BeautifulSoup(c, "html.parser")

comment_items = soup.select("div.name.main-name")
print(len(comment_items))
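
To illustrate how select differs from find_all for multi-class elements, here is a small sketch on made-up markup (SAMPLE is hypothetical, not taken from the real page). As far as I know, find_all with class_="name main-name" compares against the exact class string, while the CSS selector div.name.main-name matches any div carrying both classes. Either call can only match elements that are present in the markup passed to BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only to compare the two lookups.
SAMPLE = """
<div class="name main-name">A</div>
<div class="main-name name">B</div>
<div class="name main-name extra">C</div>
<div class="name">D</div>
"""

soup = BeautifulSoup(SAMPLE, "html.parser")

# Matches only when the class attribute is exactly "name main-name".
exact = soup.find_all("div", class_="name main-name")

# Matches any div that has both classes, in any order, with or without extras.
by_selector = soup.select("div.name.main-name")

exact_texts = [d.get_text() for d in exact]
selector_texts = [d.get_text() for d in by_selector]

print(exact_texts)     # ['A']
print(selector_texts)  # ['A', 'B', 'C']
```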

