Web scraping with Python Selenium - can't find button

Question

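I'm trying to access some data from this web page: http://www.b3.com.br/pt_br/produtos-e-servicos/negociacao/renda-variavel/empresas-listadas.htm. I want to use Selenium to click the button named "Setor de atuação". The problem is that the HTML the requests library returns is different from what I see when I inspect the page. I already tried sending headers with my request, but that did not solve it. Even when I print the contents of

browser.page_source

I still get only an incomplete part of the page I want. While trying to solve the problem, I noticed that two requests are posted when the site initializes: print1

I don't know what to do now. If anyone can help me or point me to a tutorial that explains what is happening, I'd be very glad. Thanks in advance. I have only done simple web scraping before, so I'm not sure how to proceed. I also checked other questions on the forum, and none of them seemed similar to my problem.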

import bs4 as bs
import requests
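from selenium import webdriver
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
options.add_argument('--incognito')  # private window
# options.add_argument('--headless')  # run without opening a browser window

# Selenium 4 takes the driver path via Service and the options via `options=`.
browser = webdriver.Chrome(service=Service(executable_path='/home/itamar/Desktop/chromedriver'), options=options)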

site = 'http://www.b3.com.br/pt_br/produtos-e-servicos/negociacao/renda-variavel/empresas-listadas.htm'

browser.get(site)

That is my code so far. I can't find and click the "Setor de Atuação" button. I've tried XPath, class, and id, but nothing seems to work.

Answer 1

The button you want is inside an iframe. In this case you have to use your Selenium driver's switch_to API to switch the driver into the iframe's DOM, and only then can you look for the button. I tried it on the page you provided and it works, using only Selenium; Beautiful Soup isn't needed. Here is my code:

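from selenium import webdriver
from selenium.webdriver.common.by import By
import time

class B3:
    def __init__(self):
        self.bot = webdriver.Firefox()

    def start(self):
        bot = self.bot
        bot.get('http://www.b3.com.br/pt_br/produtos-e-servicos/negociacao/renda-variavel/empresas-listadas.htm')
        time.sleep(2)

        # The tab lives inside an iframe, so switch the driver into it first.
        # (find_element_by_xpath was removed in Selenium 4; use find_element with By.)
        iframe = bot.find_element(By.XPATH, '//iframe[@id="bvmf_iframe"]')
        bot.switch_to.frame(iframe)
        bot.implicitly_wait(30)

        # Locate the "Setor de Atuação" tab inside the iframe.
        tab = bot.find_element(By.XPATH, '//a[@id="ctl00_contentPlaceHolderConteudo_tabMenuEmpresaListada_tabSetor"]')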
        time.sleep(3)
        tab.click()
        time.sleep(2)

if __name__ == "__main__":
    worker = B3()
    worker.start()

Hope it works for you!

Reference: https://www.techbeamers.com/switch-between-iframes-selenium-python/
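To see why the button is missing from the HTML you print, it can help to list the iframe tags in that HTML: an iframe's content is a separate document, so the outer page source only contains the tag itself. Below is a minimal sketch using only the standard library. The names `IframeFinder` and `find_iframes` are made up for illustration, and `sample` is a hypothetical stand-in for `browser.page_source` (the `src` value is invented, not the real B3 URL).

```python
from html.parser import HTMLParser


class IframeFinder(HTMLParser):
    """Collect the attributes of every <iframe> tag in an HTML string."""

    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "iframe" matches <IFRAME> too.
        if tag == "iframe":
            self.iframes.append(dict(attrs))


def find_iframes(html):
    """Return one dict of attributes per <iframe> found in `html`."""
    finder = IframeFinder()
    finder.feed(html)
    return finder.iframes


# Hypothetical stand-in for browser.page_source; the src value is made up.
sample = """
<html><body>
  <h1>Empresas listadas</h1>
  <iframe id="bvmf_iframe" src="https://example.com/inner-page"></iframe>
</body></html>
"""

for frame in find_iframes(sample):
    print(frame.get("id"), frame.get("src"))
```

If an iframe shows up this way, the elements you see when inspecting the page probably belong to the iframe's document, which is why you need to switch into it before searching.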

Answer 2

In this case I suggest using only Selenium, because the page depends on JavaScript processing.

You can inspect the element and select it with XPath.

XPath : //*[@id="ctl00_contentPlaceHolderConteudo_tabMenuEmpresaListada_tabSetor"]/span/span

So your code would look like:

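import time
from selenium.webdriver.common.by import By
# `driver` is your existing WebDriver instance.
elementSelect = driver.find_elements(By.XPATH, '//*[@id="ctl00_contentPlaceHolderConteudo_tabMenuEmpresaListada_tabSetor"]/span/span')
elementSelect[0].click()
time.sleep(5)  # Wait for the page to load.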

PS: I suggest looking for B3's API service. I found this link, but I haven't read it. Maybe they already provide this data broken out.

About XPath: https://www.guru99.com/xpath-selenium.html

Answer 3

I couldn't understand the question, so it would be better if you could show a code snippet. I suggest using BeautifulSoup for web scraping.

3 answers

Answer 1
3, accepted, 2020-05-07 04:19:33

Answer 2
1, 2020-05-07 02:30:13

Answer 3
0, 2020-05-07 02:03:41
