Using find_all in BeautifulSoup to grab Class
I'm trying to scrape data from the NYSE, specifically the elements with class "flex_tr" on https://www.nyse.com/quote/XNGS:AAPL , whose HTML path is:
html->body->div->div.sticky-header__main->div.landing-section->div.idc-container->div->div->div.row->div.col-lg-12.col-md-12->div.d-widget.d-vbox.d-flex1.DataTable-nyse->div.d-container.d-flex1.d-vbox.d-nowrap.d-justify-start.data-table-container.d-noscroll->div.d-flex1->div.d-vbox->div.d-flex-1.d-scroll-y->div.contentContainer->div.flex_tr
This should match a large number of rows, but at the moment I can't get the contents of any of them. I have tried both soup.find_all("div", class_="flex_tr") and soup.find_all("div", {"class": "flex_tr"}), but neither seems to return anything.
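As a side note, both of those find_all calls are syntactically correct. A minimal sketch against a hypothetical static HTML snippet (the class name is taken from the question; the row contents are made up) shows they do find the elements when the markup is actually present in the parsed source, which points to the real problem being that the NYSE rows are rendered by JavaScript after the initial page load:

```python
from bs4 import BeautifulSoup

# Hypothetical static HTML containing rows like the ones on the NYSE page.
static_html = """
<div class="contentContainer">
  <div class="flex_tr">row 1</div>
  <div class="flex_tr">row 2</div>
</div>
"""

soup = BeautifulSoup(static_html, "html.parser")
# Both forms of the call match the same elements.
rows = soup.find_all("div", class_="flex_tr")
rows_alt = soup.find_all("div", {"class": "flex_tr"})
print(len(rows), len(rows_alt))  # 2 2
```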
from selenium import webdriver
from bs4 import BeautifulSoup
# Use a raw string so the backslashes in the Windows path are not
# interpreted as escape sequences.
driver = webdriver.Chrome(r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe")
driver.get('https://www.nyse.com/quote/XNGS:AAPL')
content = driver.page_source
soup = BeautifulSoup(content, 'html.parser')
flex_tr = soup.find_all(class_="flex_tr")
print(flex_tr)
driver.close()
It looks like you're closing the driver before the elements have loaded onto the page (which is why the return value is an empty list).
Selenium includes modules that let you wait for an element to load. This question discusses it in more detail: Wait until page is loaded with Selenium WebDriver for Python
As for your problem, I was able to solve it with the following (which mirrors the accepted answer in the link above):
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup

driver = webdriver.Chrome('./chromedriver')
driver.get('https://www.nyse.com/quote/XNGS:AAPL')
delay = 5  # seconds to wait before WebDriverWait times out

try:
    # Block until at least one element with class "flex_tr" is in the DOM.
    WebDriverWait(driver, delay).until(
        expected_conditions.presence_of_element_located((By.CLASS_NAME, "flex_tr"))
    )
    content = driver.page_source
    soup = BeautifulSoup(content, 'html.parser')
    flex_tr = soup.find_all(class_="flex_tr")
    print(flex_tr)
except TimeoutException:
    print("Timeout")

driver.close()
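Once find_all returns the rows, you'll probably want the text inside them rather than the raw tags. A small sketch on a hypothetical row snippet (the span contents are invented; the real NYSE markup will differ) shows one way to pull out the cell text with get_text:

```python
from bs4 import BeautifulSoup

# Hypothetical row markup; the actual NYSE cells will be structured differently.
html = '<div class="flex_tr"><span>AAPL</span><span>150.00</span></div>'

soup = BeautifulSoup(html, "html.parser")
for row in soup.find_all(class_="flex_tr"):
    # Join the cell strings with a space and strip surrounding whitespace.
    print(row.get_text(" ", strip=True))  # AAPL 150.00
```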