How to iterate through webelements to extract text from HTML tags in Selenium Web Automation (Python)?
How to extract the text from the webelements using Selenium and Python
Code trials:
driver.get(url)
cards = driver.find_elements_by_class_name("job-cardstyle__JobCardComponent-sc-1mbmxes-0")
for card in cards:
    data = card.get_attribute('text')
    print(data)
driver.close()
driver.quit()
cards returns a list of Selenium WebElements, and I am unable to extract the text from them in the for loop.
Instead of get_attribute('text') you need to use the text attribute, as follows:
data = card.text
To locate the visible elements, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use the following solution:
driver.get(url)
cards = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "job-cardstyle__JobCardComponent-sc-1mbmxes-0")))
for card in cards:
    data = card.text
    print(data)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
In a single line, you can use a list comprehension as follows:
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "job-cardstyle__JobCardComponent-sc-1mbmxes-0")))])
The problem is in this line:
data = card.get_attribute('text')
You can do either of the following:
Use .text:
for card in cards:
    data = card.text
    print(data)
Use innerText:
for card in cards:
    data = card.get_attribute('innerText')
    print(data)
Also, as suggested in the comments above, you should print the length of the cards list to debug this more easily.
print(len(cards))
This shows whether the list contains anything at all.
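The two options and the length check can be combined into one small helper. The following is only a sketch: card_texts and FakeCard are hypothetical names, FakeCard stands in for a real WebElement so the code runs without a browser, and falling back to innerText when .text is empty is an assumption about what is wanted here, not something the answers above prescribe.

```python
def card_texts(cards):
    """Return the text of each card, falling back to innerText when .text is empty."""
    # An empty list usually points at the locator or at timing, not at the loop.
    print(f"found {len(cards)} card(s)")
    texts = []
    for card in cards:
        # Try .text first; if it is empty, try the innerText attribute instead.
        text = card.text or card.get_attribute("innerText") or ""
        texts.append(text.strip())
    return texts


class FakeCard:
    """Hypothetical stand-in for a WebElement, used only to make the sketch runnable."""

    def __init__(self, text, inner_text):
        self.text = text
        self._inner_text = inner_text

    def get_attribute(self, name):
        return self._inner_text if name == "innerText" else None


cards = [
    FakeCard("Python Developer", "Python Developer"),
    FakeCard("", " Web Developer "),  # .text is empty, innerText is not
]
result = card_texts(cards)
print(result)
```

With a real driver, the list returned by find_elements or by the WebDriverWait call would be passed in place of the FakeCard list.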
This works to some extent:
driver.get("https://www.monster.com/jobs/search?q=Python-Developer&where=Las+Vegas%2C+NV&page=1")
WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.XPATH, "//*[@data-test-id = 'svx-job-title']")))
jobs = driver.find_elements(By.XPATH, "//div[contains(@class, 'job-cardstyle__JobCardHeader')]")
all_jobs = [job.text for job in jobs]
print(all_jobs)
Imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
Output:
['Software engineer III\nRandstad USA\nLas Vegas, NV', 'C\nPython Developer\nconfidential\n$55 - $65 / Per Hour', 'C\nSenior Software Engineer\nCox Communications Inc\nLas Vegas, NV', 'Mission Systems Engineer\nDCS Corporation\nLas Vegas, NV', 'G\nSoftware Engineer - 914\nGCR Technical Staffing\nHenderson, NV', 'Z\nNetSuite Developer\nZone & Company Software Consulting\nLas Vegas, NV', 'IT Project Engineer\nRauland Florida by Ametek, Inc.\nSunrise, NV', 'A\nWeb Developer\nArdor Global', 'Senior Software Engineer – Node\nMeridian Technology Group Inc.']
Process finished with exit code 0
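You can split each entry on the \n delimiter for further use. Also, the site seems to load the cards dynamically, i.e., new cards load as you scroll down, so you may not get all the cards in a single pass.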
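As a rough illustration of the splitting step, the sketch below breaks a few of the strings from the output above into separate fields. parse_card is a hypothetical helper, and dropping a leading single-letter line (which looks like a logo placeholder in that output) is an assumption rather than something the site documents.

```python
def parse_card(text):
    """Split one card's text on newlines and return the non-empty fields."""
    parts = [part.strip() for part in text.split("\n") if part.strip()]
    # Assumption: a one-character first line is a logo placeholder, not a field.
    if len(parts) > 1 and len(parts[0]) == 1:
        parts = parts[1:]
    return parts


# A few entries copied from the output above.
all_jobs = [
    'Software engineer III\nRandstad USA\nLas Vegas, NV',
    'C\nPython Developer\nconfidential\n$55 - $65 / Per Hour',
    'A\nWeb Developer\nArdor Global',
]

parsed = [parse_card(job) for job in all_jobs]
for fields in parsed:
    print(fields)
```

The fields are left as a plain list because the cards do not all carry the same number of lines; some have a location, some a pay rate, and some neither.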