
Unable to obtain table info through python selenium

I am a newbie in the Python Selenium environment. I am trying to get the SQL versions table from the website used in the code below.

from selenium.webdriver.common.by import By
from selenium import webdriver
# define the website to scrape and the path where the chromedriver is located
website = "https://www.sqlserverversions.com"
# define the 'driver' variable
driver = webdriver.Chrome(executable_path='/Users//Downloads/chromedriver/chromedriver.exe')
# open Google Chrome with chromedriver
driver.get(website)
matches = driver.find_elements(By.TAG_NAME, 'tr')
for match in matches:
    b=match.find_elements(By.XPATH,"./td[1]")
    print(b.text)

It says AttributeError: 'list' object has no attribute 'text'. Did I choose the right syntax and the correct arguments to get the data?

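I attached one screenshot of the table I am trying to get the data from, and another of the parameters I am trying to put into the code.

Please advise what needs to be modified in the code to get the data in tabular format.

Thanks, Arun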

In one part of your code you call b.text on the result of find_elements, which returns a list. You can only call .text on a single WebElement, not on a list of them. Here is the updated code:

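from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

website = "https://www.sqlserverversions.com"
# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service('/Users//Downloads/chromedriver/chromedriver.exe'))
driver.get(website)
matches = driver.find_elements(By.CSS_SELECTOR, "tr")
# skip the first row, then print every cell of the remaining rows
for match in matches[1:]:
    # find_elements returns a list, so read .text from each element in turn
    items = match.find_elements(By.CSS_SELECTOR, "td")
    for item in items:
        print(item.text)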

This will print out a lot of lines unless you limit the loop.
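
One way to limit the loop is to slice the list of rows before iterating, and to collect the cell texts in a list instead of printing them. The sketch below is only an illustration: `collect_rows` and its `limit` parameter are hypothetical names, not part of Selenium, and the function assumes each row exposes `find_elements` the way a Selenium WebElement does.

```python
def collect_rows(rows, limit=10):
    """Scan the first `limit` rows and return their cell texts as a list of lists.

    `rows` is assumed to be the list returned by
    driver.find_elements("css selector", "tr").
    """
    table = []
    for row in rows[:limit]:
        # "css selector" is the string value of By.CSS_SELECTOR.
        # find_elements always returns a list, so .text is read per element.
        cells = row.find_elements("css selector", "td")
        if not cells:
            # Rows without <td> cells (for example header rows that
            # use <th>) contribute nothing, so they are skipped.
            continue
        table.append([cell.text for cell in cells])
    return table
```

With the `matches` list from the code above, `collect_rows(matches, limit=5)` would look at only the first five rows.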

If you only need the data from the first table:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

website = "https://www.sqlserverversions.com"
# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service('/Users//Downloads/chromedriver/chromedriver.exe'))
driver.get(website)

show_service_pack_versions = True
# rows of the first table whose link text starts with 'SQL Server'
xpath_first_table_sql_rows = "(//table[@class='tbl'])[1]//tr/td/a[starts-with(text(),'SQL Server')]//ancestor::tr"

matches = driver.find_elements(By.XPATH, xpath_first_table_sql_rows)
for match in matches:
    # find_element (singular) returns one WebElement, so .text works directly
    sql_server_a_element = match.find_element(By.XPATH, "./td/a[2]")
    print(sql_server_a_element.text)

    sql_server_rtm_version_td_element = match.find_element(By.XPATH, ".//td[@class='rtm']")
    print('RTMs:')
    print(sql_server_rtm_version_td_element.text)

    if show_service_pack_versions:
        print('SPs:')
        # find_elements (plural) returns a list, so loop over it
        sql_server_sp_version_td_elements = match.find_elements(By.XPATH, ".//td[@class='sp']")
        for td in sql_server_sp_version_td_elements:
            print('---')
            print(td.text)

    print('----------------------------------')

If you set show_service_pack_versions = False, the information about service packs is skipped.

If you only need the text, it is simpler to do this on the browser side:

data = driver.execute_script("""
  return [...document.querySelectorAll('tr')].map(tr => [...tr.querySelectorAll('td')].map(td => td.innerText))
""")

