How to find an aria-label element using contains in XPath

I'm trying to get a piece of information that sits inside an anchor tag but is not the href. I want to extract the rating score of a few sellers on eBay. The HTML below shows where the rating score can be found. Is there a way to get the "Bewertungspunktestand" (German for rating score) without using the href, since the href changes from seller to seller? The rating score in this example is 32. Since the text "Bewertungspunktestand" appears only in this line, I thought it would be possible to search for this text and extract the aria-label that contains it.

This is the link for this example: https://www.ebay.de/itm/Apple-MacBook-Pro-15-Laptop-mit-Touchbar-512GB-MPTT2D-A-Wie-neu/133585540546?nordt=true&nma=true&orig_cvip=true

This is the Python code I tried, which didn't work:

try:
    trans = driver.find_element_by_xpath("//a[@aria-label='Bewertungspunktestand']")
except:
    trans = '0'

And this is the HTML code:

 <span class="mbg-l"> (<a href="http://feedback.ebay.de/ws/eBayISAPI.dll?ViewFeedback&amp;userid=thuanhtran&amp;iid=133585540546&amp;ssPageName=VIP:feedback&amp;ftab=FeedbackAsSeller&amp;rt=nc&amp;_trksid=p2047675.l2560" aria-label="Bewertungspunktestand: 32">32</a> <span class="vi-mbgds3-bkImg vi-mbgds3-fb10-49" aria-label="Gelber Stern für 10 bis 49 Bewertungspunkte" role="img"></span>) </span>

Sure you can. Use XPath's contains() function, combined with the ability to select any attribute (here @aria-label):

//a[contains(@aria-label, 'Bewertungspunktestand:')]

Specifically, to get the text value of that link element:

trans = driver.find_element_by_xpath("//a[contains(@aria-label, 'Bewertungspunktestand:')]").text

The value of the aria-label attribute isn't Bewertungspunktestand but Bewertungspunktestand: 32, so an exact comparison against Bewertungspunktestand never matches.
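The difference between an exact attribute comparison and a substring match can be checked offline. The following is a minimal sketch, not the Selenium call itself: it uses Python's standard-library xml.etree.ElementTree, whose XPath subset supports [@attr='value'] but not contains(), so the substring test is written in plain Python as a stand-in. The href in the snippet is shortened for readability, and the names SNIPPET, exact and partial are illustrative.

```python
import xml.etree.ElementTree as ET

# Seller markup from the question; the href is shortened (assumption for brevity).
SNIPPET = (
    '<span class="mbg-l"> ('
    '<a href="http://feedback.ebay.de/" aria-label="Bewertungspunktestand: 32">32</a> '
    '<span aria-label="Gelber Stern für 10 bis 49 Bewertungspunkte" role="img"></span>'
    ') </span>'
)

root = ET.fromstring(SNIPPET)

# Exact comparison, like //a[@aria-label='Bewertungspunktestand']:
# no match, because the real attribute value also contains ": 32".
exact = root.findall(".//a[@aria-label='Bewertungspunktestand']")

# Substring test in Python, standing in for
# contains(@aria-label, 'Bewertungspunktestand:').
partial = [
    a for a in root.iter("a")
    if "Bewertungspunktestand:" in a.get("aria-label", "")
]

print(len(exact))       # 0
print(partial[0].text)  # 32
```

In a real browser session the same reasoning applies to the XPath expressions: the exact form finds nothing, while contains() or starts-with() finds the link.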

To print the value, i.e. 32, from the innerHTML you can use either of the following Locator Strategies:

  • Using css_selector and text attribute:

     driver.get('https://www.ebay.de/itm/Apple-MacBook-Pro-15-Laptop-mit-Touchbar-512GB-MPTT2D-A-Wie-neu/133585540546?nordt=true&nma=true&orig_cvip=true')
     print(driver.find_element_by_css_selector("a[aria-label^='Bewertungspunktestand']").text)

  • Using xpath and get_attribute():

     driver.get('https://www.ebay.de/itm/Apple-MacBook-Pro-15-Laptop-mit-Touchbar-512GB-MPTT2D-A-Wie-neu/133585540546?nordt=true&nma=true&orig_cvip=true')
     print(driver.find_element_by_xpath("//a[starts-with(@aria-label, 'Bewertungspunktestand')]").get_attribute("innerHTML"))

Ideally you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and get_attribute():

     driver.get('https://www.ebay.de/itm/Apple-MacBook-Pro-15-Laptop-mit-Touchbar-512GB-MPTT2D-A-Wie-neu/133585540546?nordt=true&nma=true&orig_cvip=true')
     print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a[aria-label^='Bewertungspunktestand']"))).get_attribute("innerHTML"))

  • Using XPATH and text attribute:

     driver.get('https://www.ebay.de/itm/Apple-MacBook-Pro-15-Laptop-mit-Touchbar-512GB-MPTT2D-A-Wie-neu/133585540546?nordt=true&nma=true&orig_cvip=true')
     print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//a[starts-with(@aria-label, 'Bewertungspunktestand')]"))).text)

  • Console Output:

     32
  • Note: You have to add the following imports:

     from selenium.webdriver.support.ui import WebDriverWait
     from selenium.webdriver.common.by import By
     from selenium.webdriver.support import expected_conditions as EC

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python


Outro

Link to useful documentation:

From your query, what I understand is that you want to get all the aria-label values on the page. The XPath below returns all the aria-label values on the webpage, which you can then iterate over in a loop.

//span[@class='mbg-l']/a/@aria-label
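Once the aria-label strings have been collected (for example by calling get_attribute("aria-label") on each matching element), the numeric score still has to be separated from the label text. The helper below is a small sketch of that step. The function name parse_rating and the default of 0 are illustrative choices, and it assumes the score is written as plain digits directly after "Bewertungspunktestand:", as in the example markup.

```python
import re


def parse_rating(aria_label: str, default: int = 0) -> int:
    """Return the integer after 'Bewertungspunktestand:', or default if absent.

    Assumes the score is plain digits with no thousands separator.
    """
    match = re.search(r"Bewertungspunktestand:\s*(\d+)", aria_label)
    return int(match.group(1)) if match else default


# Two aria-label values taken from the HTML in the question.
labels = [
    "Bewertungspunktestand: 32",
    "Gelber Stern für 10 bis 49 Bewertungspunkte",
]

scores = [parse_rating(label) for label in labels]
print(scores)  # [32, 0]
```

Returning a default instead of raising mirrors the fallback of '0' in the question's try/except, but as an integer so the scores can be compared or summed.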

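Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you repost them, please cite this site's URL or the original source. For any questions, contact: yoyou2525@163.com.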