Need help identifying the right XPath

I'm trying to scrape the whole table from this website: https://qmjhldraft.rinknet.com/results.htm?year=2018

When the XPath is a plain td (the player names, for example), I can locate the cells with a simple XPath like this:

players = driver.find_elements(By.XPATH, '//tr[@rnid]/td[4]')

And I can scrape the players' names using this code:

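from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Raw string so the backslashes in the Windows path are not read as escapes.
PATH = r'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(service=Service(executable_path=PATH))
driver.get('https://qmjhldraft.rinknet.com/results.htm?year=2018')

try:
    # Wait up to 10 seconds for the first cell of a data row to appear.
    elements = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//tr[@rnid]/td[1]"))
    )
finally:
    # The fourth cell of each data row holds the player name.
    players = driver.find_elements(By.XPATH, '//tr[@rnid]/td[4]')

# Print the first five player names.
for player in players[:5]:
    pl = player.text
    print(pl)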

But when I get to the "Height" column, I can't find the right XPath. I guess this has to do with the td having a class, "ht-itemVisibility1", which changes the way to scrape it. I've tried a few different XPaths, like:

('//tr/td[@class="ht-itemVisibility1"][1]')
('//tr/td[@class="ht-itemVisibility1"][5]')
('//tr[@rnid]/td[5]')

to no avail. Can someone enlighten me on how to write an XPath for a td that has a class? Thanks a lot.
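For reference, the two kinds of step tried above select different cells: td[5] means "the fifth td child of the row", while td[@class="ht-itemVisibility1"][1] means "the first td among those whose class attribute is exactly ht-itemVisibility1". The sketch below illustrates that difference on a made-up table. The markup, the cell values, and the helper names nth_cell and nth_cell_with_class are all assumptions for illustration, not the real page. The standard library's ElementTree only supports a subset of XPath, so the positional part is done with a Python index.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for illustration only; the real page may be laid out differently.
SAMPLE = """
<table>
  <tr rnid="1">
    <td>1</td><td>Team A</td><td>F</td><td>Player One</td>
    <td class="other">n/a</td>
    <td class="ht-itemVisibility1">6.01</td>
    <td class="ht-itemVisibility1">Club X</td>
  </tr>
  <tr rnid="2">
    <td>2</td><td>Team B</td><td>D</td><td>Player Two</td>
    <td class="other">n/a</td>
    <td class="ht-itemVisibility1">5.11</td>
    <td class="ht-itemVisibility1">Club Y</td>
  </tr>
</table>
"""


def nth_cell(row, n):
    """Mimic the XPath step td[n] relative to one row (1-based)."""
    return row.findall("td")[n - 1].text


def nth_cell_with_class(row, css_class, n):
    """Mimic td[@class="css_class"][n]: filter by class first, then count."""
    return row.findall(f"td[@class='{css_class}']")[n - 1].text


table = ET.fromstring(SAMPLE)
rows = table.findall(".//tr[@rnid]")  # only rows carrying an rnid attribute

for row in rows:
    print(
        nth_cell(row, 4),                                   # plain position
        nth_cell(row, 5),                                   # fifth cell, whatever its class
        nth_cell_with_class(row, "ht-itemVisibility1", 1),  # first cell with that class
    )
```

In this made-up table the fifth cell and the first ht-itemVisibility1 cell are different cells, which is why the two expressions should not be expected to return the same thing.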

Try this:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager

# webdriver_manager downloads a matching chromedriver and returns its path.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get('https://qmjhldraft.rinknet.com/results.htm?year=2018')

try:
    # Wait up to 10 seconds for the first cell of a data row to appear.
    elements = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, "//tr[@rnid]/td[1]"))
    )
finally:
    players = driver.find_elements(By.XPATH, '//tr[@rnid]/td[4]')

# Player names
for player in players[:5]:
    pl = player.text
    print(pl)

# Height: first td with class "ht-itemVisibility1" in each row
players_height = driver.find_elements(By.XPATH, '//tr/td[@class="ht-itemVisibility1"][1]')

for player in players_height[:5]:
    pl = player.text
    print(pl)

# Last team: fifth td with class "ht-itemVisibility1" in each row
players_last_team = driver.find_elements(By.XPATH, '//tr/td[@class="ht-itemVisibility1"][5]')

for player in players_last_team[:5]:
    pl = player.text
    print(pl)
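If the goal is to keep each player's name, height, and last team together, one option is to walk the table row by row and use XPaths relative to the row (starting with "./") instead of collecting three separate lists. The following is only a rough sketch: the column positions are taken from the code above, while scrape_rows, its limit parameter, and the dictionary keys are names made up here. The string "xpath" is the value of Selenium's By.XPATH, spelled out so that the helper itself does not need to import Selenium.

```python
# Same value as selenium.webdriver.common.by.By.XPATH.
XPATH = "xpath"


def scrape_rows(driver, limit=None):
    """Return one dict per table row, reading cells relative to that row.

    `driver` is expected to be a Selenium WebDriver that has already
    loaded the results page. Column positions follow the thread above
    and are an assumption about the page layout.
    """
    records = []
    rows = driver.find_elements(XPATH, "//tr[@rnid]")
    for row in rows[:limit]:  # limit=None keeps every row
        records.append({
            "name": row.find_element(XPATH, "./td[4]").text,
            "height": row.find_element(
                XPATH, './td[@class="ht-itemVisibility1"][1]').text,
            "last_team": row.find_element(
                XPATH, './td[@class="ht-itemVisibility1"][5]').text,
        })
    return records


# Usage, after the wait in the code above has succeeded:
# for record in scrape_rows(driver, limit=5):
#     print(record)
```

Because every lookup starts from a single tr, the three values in a record always come from the same row, even if some rows on the page are missing a cell.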
