
Extract contents from pop-up window with Python Selenium

I want to extract a person's bio ("John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight of ...") from this webpage: https://www.morningstar.com/funds/xnas/lziex/people


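My code does not work because the content is in a pop-up window. From some existing questions, it seems I need to use click() and then find the element in that window. However, I don't know how to locate the element to click. Thanks.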

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument('headless')
driver = webdriver.Chrome(options=options)
driver.get('https://www.morningstar.com/funds/xnas/lziex/people')
element=driver.find_elements_by_xpath('//*[@class="sal-modal-biography ng-binding ng-scope"]')
print(element.text) 

I also tried this, but it did not work:

element =  driver.find_element_by_xpath("//button[@class='sal-icons sal-icons--close mds-button mds-button--icon-only']")
driver.execute_script('arguments[0].click();',element)

driver.switch_to_alert()
print(driver.find_elements_by_xpath('//*[@class="sal-modal-biography ng-binding ng-scope"]'))

This is part of the HTML.

<div class="sal-component-ctn sal-modal-scrollable" style="display: block;" aria-hidden="true"><div class="sal-component-mip-manager-pop-out reveal-modal mds-modal ng-isolate-scope open" data-reveal="" manager-data="vm.managerData" style="display: block; opacity: 1; visibility: visible; top: 335.333px;" tabindex="0" aria-hidden="false">
    <div class="sal-row">
        <div class="sal-manager-modal">
            <div class="sal-manager-modal__modalHeader" ng-class="{'sal-fixed':vm.fixedHeader}" ng-style="vm.headerStyle" style="height: auto; width: auto;">
                <span class="sal-modal-header__menu">
                    <button class="sal-icons sal-icons--close mds-button mds-button--icon-only" type="button">
                        <svg class="mds-icon mds-button__icon mds-button__icon--left">
                            <use xlink:href="#remove">
                                <title class="ng-binding">Close</title>
                            </use>
                        </svg>
                    </button>
                </span>
                <div class="sal-modal-header__title ng-binding">
                    John R. Reinsberg
                </div>
            </div>
            <div class="sal-manager-modal__body" ng-style="{'margin-top': vm.headerStyle.height}" style="margin-top: auto;">
                <div class="sal-modal-dps">
                    <ul class="sal-xsmall-block-grid-2 small-block-grid-3 medium-block-grid-5 large-block-grid-5">
                                      </ul>
                </div>
                <!-- ngIf: vm.managerModalData.fundManager.biography.managerProvidedBiography || (vm.managerModalData.fundManager.CollegeEducationDetailList && vm.managerModalData.fundManager.CollegeEducationDetailList.length > 0) --><div class="sal-columns sal-small-12 sal-medium-6 sal-large-6 ng-scope" ng-if="vm.managerModalData.fundManager.biography.managerProvidedBiography || (vm.managerModalData.fundManager.CollegeEducationDetailList &amp;&amp; vm.managerModalData.fundManager.CollegeEducationDetailList.length > 0)" ng-class="{'sal-medium-12 sal-large-12': !vm.managerModalData.currentManagedFundList || vm.managerModalData.currentManagedFundList.length === 0}">
                    <!-- ngIf: vm.managerModalData.fundManager.biography.managerProvidedBiography --><div class="sal-modal-biography ng-binding ng-scope" ng-if="vm.managerModalData.fundManager.biography.managerProvidedBiography">
                        <!-- ngIf: !vm.managerModalData.fundManager.biography.isLocalized -->
                        John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight of the firm's international and global strategies. He is also a Portfolio Manager/Analyst on the Global Equity and International Equity portfolio teams. He began working in the investment field in 1981. Prior to joining Lazard in 1992, John was Executive Vice President with General Electric Investment Corporation and Trustee of the General Electric Pension Trust.
                    </div><!-- end ngIf: vm.managerModalData.fundManager.biography.managerProvidedBiography -->

                    </div>
                </div>
            </div>
        </div>
    </div>
</div></div>
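As a quick offline check of the locator, the class name seen in this markup can be tested against a saved copy of the page source without a browser. The sketch below is not part of the original post: it uses Python's standard-library html.parser, and the names ClassTextExtractor and extract_text_by_class, as well as the trimmed sample string, are hypothetical and only for illustration.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text inside the first <div> that carries a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # nesting level of <div> tags inside the target div
        self.done = False   # set once the target div has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div" or self.done:
            return
        if self.depth > 0:
            # A nested <div> inside the target: track it so we know when to stop.
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        # HTML comments (such as the ngIf markers) never reach this method.
        if self.depth > 0:
            self.chunks.append(data)

    def text(self):
        # Collapse runs of whitespace left over from the markup's indentation.
        return " ".join(" ".join(self.chunks).split())


def extract_text_by_class(html, css_class):
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    parser.close()
    return parser.text()


# Trimmed, hypothetical sample modelled on the markup above.
sample = """
<div class="sal-manager-modal__body">
  <div class="sal-modal-biography ng-binding ng-scope" ng-if="x">
    <!-- ngIf: y -->
    John Reinsberg is Deputy Chairman of Lazard Asset Management.
  </div><!-- end ngIf -->
</div>
"""
print(extract_text_by_class(sample, "sal-modal-biography"))
```

Matching on the single class sal-modal-biography, rather than the full class string, keeps the check independent of the Angular helper classes (ng-binding, ng-scope) that may change between renders.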

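To extract the bio "John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight of..." from the webpage https://www.morningstar.com/funds/xnas/lziex/people, you first need to click the manager's name to open the pop-up, inducing WebDriverWait for element_to_be_clickable(), and then wait for the biography element to become visible. You can use the following locator strategies: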

  • Code block:

     from selenium import webdriver
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.support import expected_conditions as EC

     options = webdriver.ChromeOptions()
     options.add_argument("start-maximized")
     options.add_experimental_option("excludeSwitches", ["enable-automation"])
     options.add_experimental_option('useAutomationExtension', False)
     driver = webdriver.Chrome(options=options, executable_path=r'C:\\Utility\\BrowserDrivers\\chromedriver.exe')
     driver.get("https://www.morningstar.com/funds/xnas/lziex/people")
     # Click the manager's name to open the biography pop-up
     WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='sal-management-team__memberName']/a//span[text()='Reinsberg']/.."))).click()
     # Wait until the biography inside the pop-up is visible, then print its text
     print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.sal-modal-biography.ng-binding.ng-scope"))).text.strip())

  • Console output:

     John Reinsberg is Deputy Chairman of Lazard Asset Management responsible for oversight of the firm's international and global strategies. He is also a Portfolio Manager/Analyst on the Global Equity and International Equity portfolio teams. He began working in the investment field in 1981. Prior to joining Lazard in 1992, John was Executive Vice President with General Electric Investment Corporation and Trustee of the General Electric Pension Trust.
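
The explicit waits in the code above work by polling: the condition is checked repeatedly until it yields a result or the timeout expires. The following is a minimal, Selenium-free sketch of that idea, not Selenium's actual implementation. The names wait_until and modal_text are hypothetical, and the 0.5 second default interval is chosen to mirror WebDriverWait's default polling frequency.

```python
import time


def wait_until(condition, timeout=20.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for a pop-up whose text only "appears" on the third check.
attempts = {"count": 0}


def modal_text():
    attempts["count"] += 1
    return "biography text" if attempts["count"] >= 3 else None


print(wait_until(modal_text, timeout=2.0, poll_interval=0.01))
```

This is why the biography can be read reliably only after the click: the pop-up content is rendered some time later, and the wait keeps re-checking instead of failing on the first lookup.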

