How to extract texts from container elements while iterating over those container elements in Selenium WebDriver (Python scraping)
Selenium WebDriver - How to extract texts through scraping
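I am trying to scrape information from a company's careers website. I want to get the reference code of each job posting.
I want to use Selenium and tried to identify the job posting codes with XPath. When I run the code, a Google Chrome window opens with the correct URL: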
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import pandas as pd
PATH = "C:/Users/MyUser/Desktop/Driver/chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.uke.jobs/sap(bD1kZSZjPTUwMA==)/bc/bsp/kwp/bsp_eui_rd_uc/main.do?action=to_uc_search")
driver.maximize_window()
ref_code = driver.find_elements_by_xpath("//tr[@data-eui-handler=\"{event:'click',handler:'eui.app.controller.search_results.selectRow'}\"]/td[1]")
print(len(ref_code))
User_input = input()
The code takes a very long time to run, and I get the following output:
DevTools listening on ws://127.0.0.1:52187/devtools/browser/7300c3d2-42d1-4f8e-a136-4e1ce37bcb87
c:\Users\MyUser\Desktop\PyhtonVisStuCo\Selenium.py:15: DeprecationWarning: find_elements_by_xpath is deprecated. Please use find_elements(by=By.XPATH, value=xpath) instead
ref_code = driver.find_elements_by_xpath("//tr[@data-eui-handler=\"{event:'click',handler:'eui.app.controller.search_results.selectRow'}\"]/td[1]")
0
[3516:18308:0609/194039.395:ERROR:device_event_log_impl.cc(214)] [19:40:39.395] Bluetooth: bluetooth_adapter_winrt.cc:1074 Getting Default Adapter failed.
What am I doing wrong?
To extract the texts from the Referenzcode column, you can use a list comprehension with either of the following locator strategies:
Using CSS_SELECTOR:
driver.get("https://www.uke.jobs/sap(bD1kZSZjPTUwMA==)/bc/bsp/kwp/bsp_eui_rd_uc/main.do?action=to_uc_search")
print([my_elem.text for my_elem in driver.find_elements(By.CSS_SELECTOR, "table#table_search_results tr[data-head] td:first-of-type")])
Using XPATH:
driver.get("https://www.uke.jobs/sap(bD1kZSZjPTUwMA==)/bc/bsp/kwp/bsp_eui_rd_uc/main.do?action=to_uc_search")
print([my_elem.text for my_elem in driver.find_elements(By.XPATH, "//table[@id='table_search_results']//tr[@data-head]/td[1]")])
Console output:
['ZVW22192', 'ZPF2208_ex', 'ZPF2207_e', 'ZPF2206_e', 'ZMF2249', 'ZIT22484', 'ZIT22444', 'ZIT22380', 'ZIT22379', 'WS22536']
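The question reports a slow page load and a count of 0, which may mean the rows were not yet rendered when the lookup ran. The following is a minimal sketch of one way to guard against that, assuming timing is the cause: it waits explicitly for the cells before reading their text. The function names `extract_texts` and `fetch_reference_codes` and the 20-second timeout are illustrative assumptions, not part of the original answer; the CSS selector is the one shown above.

```python
def extract_texts(elements):
    """Return the stripped, non-empty text of each element-like object."""
    texts = []
    for element in elements:
        text = (element.text or "").strip()
        if text:
            texts.append(text)
    return texts


def fetch_reference_codes(driver, url, timeout=20):
    """Open `url`, wait for the first-column cells, and return their texts.

    `timeout` (seconds) is an assumed value; tune it for the real page.
    """
    # Selenium is imported here so extract_texts() can be used without it.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    locator = (
        By.CSS_SELECTOR,
        "table#table_search_results tr[data-head] td:first-of-type",
    )
    # Blocks until every matching cell is visible, or raises TimeoutException.
    cells = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located(locator)
    )
    return extract_texts(cells)
```

With an existing `driver`, calling `fetch_reference_codes(driver, url)` with the search URL from the question would return the list of codes, or raise `TimeoutException` if the table never appears within the timeout.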