簡體   English   中英

從 HTML 表的每一行中抓取每個元素

[英]Scraping each element from each row from an HTML table

網站鏈接: http://www.tennisabstract.com/cgi-bin/player-classic.cgi?p=RafaelNadal

我正在嘗試編寫遍歷表中每一行並從該行中提取每個元素的代碼。 我的目標是在以下布局中輸出

Row1Element1, Row1Element2, Row1Element3 
Row2Element1, Row2Element2, Row2Element3
Row3Element1, Row3Element2, Row3Element3

我在編寫這個代碼時進行了兩次主要嘗試。

嘗試1:

rows = driver.find_elements_by_xpath('//table//body//tr')
elements = rows.find_elements_by_xpath('//td')
#this gets all rows in the table, but then gets all elements on the page, 
not just the table

嘗試2:

driver.find_elements_by_xpath('//table//body//tr//td')
#this gets all the elements that I want, but makes no distinction to which 
 row each element belongs to

任何幫助表示贊賞

您可以獲取表標題並使用索引來獲取行數據中的正確順序。

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.tennisabstract.com/cgi-bin/player-classic.cgi?p=RafaelNadal")

table_headers = [th.text.strip() for th in driver.find_elements_by_css_selector("#matchheader th")]
rows = driver.find_elements_by_css_selector("#matches tbody > tr")

date_index = table_headers.index("Date")
tournament_index = table_headers.index("Tournament")
score_index = table_headers.index("Score")

for row in rows:
    table_data = row.find_elements_by_tag_name("td")
    print(table_data[date_index].text, table_data[tournament_index].text, table_data[score_index].text)

這是每行的定位器,你的意思是XPATH: //table[@id="matches"]//tbody//tr

首先導入:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

每行:

driver.get('http://www.tennisabstract.com/cgi-bin/player-classic.cgi?p=RafaelNadal')

rows = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, '//table[@id="matches"]//tbody//tr')))

for row in rows:
    print(row.text)

或每個單元格:

for row in rows:
    cols = row.find_elements_by_tag_name('td')
    for col in cols:
        print(col.text)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM