
Selenium Python : find element whose href attribute has required keyword

The page I'm working on is at this link.

This is the relevant portion of that page:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 
  <head>...</head>
  <body>
    ...
    <div id="searchResults">
      <div class="box-related">...</div>
      <a href='downloadDataServlet?category=true&amp;type=epar' onclick=""><img src="/ema/images/icon_download_spread.gif" />Download results to spreadsheet</a>
      <div class="table-holder">
        <table class="table-epar eparResults" border="1" cellpadding="0" cellspacing="0" summary="Search results for EPARs ordered alphabetically">
          <caption>EPAR Search results</caption>
          <thead> ... </thead>
          <tbody>
            <tr>
              <th scope="row" class="key-detail name word-wrap">
                <a href="index.jsp?curl=pages/medicines/human/medicines/000471/human_med_000619.jsp&amp;mid=WC0b01ac058001d124">Abilify</a>
              </th>
              ...
            </tr>
            <tr>...</tr>
          </tbody>
        </table>
      </div>
    </div>
  </body>
</html>

This is the XPath location of the element I wish to select:

//*[@id="searchResults"]/div[2]/table/tbody/tr[1]/th/a

But there may be many results on the search page, so I want to click the link whose URL contains the product number I'm searching for (000471 in this case). In other words, I want to select the <a> element whose href attribute contains that string.

Here's what I've tried:

inp = driver.find_element_by_xpath("//*[@id='searchResults']/div[2]/table/tbody/tr[1]/th/a[contains(@href,'"+str3+"')]")
inp.click()

where str3 has the value 000471 in this case. But I keep getting NoSuchElementException.

Any help would be appreciated!

The problem is probably caused by elements that the source code viewer or inspector inserts when it rebuilds the table. The tbody tag is often shown there even when it doesn't exist in the real source.

You can eliminate the unnecessary steps in your XPath, if you can still obtain a unique location path to the data you wish to select. This might be sufficient:

//*[@id='searchResults']//a[contains(@href,'000471')]

If the other steps are still necessary, you can try it without the tbody step.
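As a rough sketch, the shortened expression above could be used from Selenium Python like this. It assumes `driver` is an already-open WebDriver on the search page and that the keyword contains no single quote. The helper names `href_xpath` and `click_result` are made up for illustration. It uses the Selenium 4 style `find_element(By.XPATH, ...)`, since the `find_element_by_xpath` helper from the question is deprecated in Selenium 4.

```python
def href_xpath(container_id: str, keyword: str) -> str:
    """Build an XPath for any <a> under the container whose href contains keyword."""
    # Assumption: keyword has no single quote (product numbers such as 000471 do not).
    return f"//*[@id='{container_id}']//a[contains(@href,'{keyword}')]"


def click_result(driver, keyword: str) -> None:
    """Click the first matching link; driver is an already-open Selenium WebDriver."""
    # Imported here so href_xpath can be used even where Selenium is not installed.
    from selenium.webdriver.common.by import By

    driver.find_element(By.XPATH, href_xpath("searchResults", keyword)).click()


str3 = "000471"
print(href_xpath("searchResults", str3))
```

Calling `click_result(driver, str3)` would then replace the two lines from the question.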

Update: I also noticed that your search page declares a namespace:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 
    ...

Automatic registration of default namespaces is implementation dependent. In XPath 1.0 an unprefixed element name only matches elements that are in no namespace, so elements in a namespace have to be selected with a prefixed name. If your Selenium implementation doesn't handle that for you, you need to either register a namespace/prefix mapping and prefix every element in the namespace (e.g. //h:table/h:tr/h:td), or ignore the namespace by using wildcards and comparing the local name in a predicate.
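To see the namespace effect in isolation, here is a small sketch using Python's standard `xml.etree.ElementTree` on a trimmed copy of the markup from the question. ElementTree is only a stand-in here: it supports a limited XPath subset (no `contains()`), so the href check is done in Python, and it says nothing about what the browser behind Selenium actually does. The prefix `h` is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

# Trimmed sample of the page from the question, with the same default namespace.
XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div id="searchResults">
      <table><tr><th>
        <a href="index.jsp?curl=pages/medicines/human/medicines/000471/human_med_000619.jsp&amp;mid=WC0b01ac058001d124">Abilify</a>
      </th></tr></table>
    </div>
  </body>
</html>"""

root = ET.fromstring(XHTML)
ns = {"h": "http://www.w3.org/1999/xhtml"}  # prefix "h" is arbitrary

unprefixed = root.findall(".//a")      # name without a namespace: no match
prefixed = root.findall(".//h:a", ns)  # name qualified via the mapping: match

# Filter on the href in Python, since ElementTree has no contains().
matches = [a for a in prefixed if "000471" in a.get("href", "")]
print(len(unprefixed), len(prefixed), matches[0].text)
```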

If the namespace is keeping you from selecting the node, you can ignore it with this expression:

//*[@id='searchResults']//*[local-name() = 'a'][contains(@href,'000471')]
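If it is unclear whether the namespace is the culprit, one option is to try the plain expression first and fall back to the `local-name()` form. The following is only a sketch under that assumption. `candidate_xpaths` and `find_first_link` are hypothetical helper names, and `driver` is assumed to be an open Selenium WebDriver. It uses `find_elements`, which returns an empty list when nothing matches instead of raising NoSuchElementException.

```python
def candidate_xpaths(keyword: str) -> list:
    """Return the plain expression first, then the namespace-agnostic one."""
    return [
        f"//*[@id='searchResults']//a[contains(@href,'{keyword}')]",
        f"//*[@id='searchResults']//*[local-name() = 'a'][contains(@href,'{keyword}')]",
    ]


def find_first_link(driver, keyword: str):
    """Return the first element matched by any candidate XPath, or None."""
    for xpath in candidate_xpaths(keyword):
        # "xpath" is the string value of Selenium's By.XPATH constant.
        found = driver.find_elements("xpath", xpath)
        if found:
            return found[0]
    return None
```

With a real driver, `link = find_first_link(driver, "000471")` followed by `link.click()` (after checking that `link` is not None) would open the matching result.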
