[英]Unable to obtain table info through python selenium
我是 python selenium 環境中的新蜜蜂。 我正在嘗試從此處輸入鏈接描述獲取 SQL 版本表
from selenium.webdriver.common.by import By
from selenium import webdriver
# define the website to scrape and path where the chromediver is located
website = "https://www.sqlserverversions.com"
driver = webdriver.Chrome(executable_path='/Users//Downloads/chromedriver/chromedriver.exe')
# define 'driver' variable
# open Google Chrome with chromedriver
driver.get(website)
matches = driver.find_elements(By.TAG_NAME, 'tr')
for match in matches:
b=match.find_elements(By.XPATH,"./td[1]")
print(b.text)
它說 AttributeError: 'list' object has no attribute 'text'。 我是否選擇了寫入語法和正確的參數來獲取數據?
下面是我試圖獲取數據的表格。 在此處輸入圖像描述
以下是我試圖放入代碼的參數。
請告知在代碼中需要修改什么以獲得表格格式的數據。
謝謝,阿倫
在獲得find_elements
的結果后,您在代碼的一部分中調用b.text
,該結果返回一個列表。 您只能在單個 WebElement(而不是它們的列表)上調用b.text
。 這是更新的代碼:
from selenium.webdriver.common.by import By
from selenium import webdriver
website = "https://www.sqlserverversions.com"
driver = webdriver.Chrome(executable_path='/Users//Downloads/chromedriver/chromedriver.exe')
driver.get(website)
matches = driver.find_elements("css selector", "tr")
for match in matches[1:]:
items = match.find_elements("css selector", "td")
for item in items:
print(item.text)
這將打印出很多行,除非您限制循環。
如果您只需要來自第一個表的數據:
from selenium.webdriver.common.by import By
from selenium import webdriver
website = "https://www.sqlserverversions.com"
driver = webdriver.Chrome(executable_path='/Users//Downloads/chromedriver/chromedriver.exe')
driver.get(website)
show_service_pack_versions = True
xpath_first_table_sql_rows = "(//table[@class='tbl'])[1]//tr/td/a[starts-with(text(),'SQL Server')]//ancestor::tr"
matches = driver.find_elements(By.XPATH, xpath_first_table_sql_rows)
for match in matches:
sql_server_a_element = match.find_element(By.XPATH, "./td/a[2]")
print(sql_server_a_element.text)
sql_server_rtm_version_a_element = match.find_element(By.XPATH, ".//td[@class='rtm']")
print('RTMs:')
print(sql_server_rtm_version_a_element.text)
if(show_service_pack_versions):
print('SPs:')
sql_server_sp_version_td_elements = match.find_elements(By.XPATH, ".//td[@class='sp']")
for td in sql_server_sp_version_td_elements:
print('---')
print(td.text)
print('----------------------------------')
如果您設置show_service_pack_versions = False
則有關服務包的信息將被跳過
如果您只需要文本,則在瀏覽器端執行此操作會更簡單:
data = driver.execute_script("""
return [...document.querySelectorAll('tr')].map(tr => [...tr.querySelectorAll('td')].map(td => td.innerText))
""")
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.