
XPath for href links based on anchor text substring

I have this HTML and need an XPath that finds every link containing the text "A1" and returns its href. The page has multiple "A1" entries, and I need the href of each one.

I can't crack it.

<a href="./leitor.do?numero=20090&amp;keyword=ministro&amp;anchor=5975889&amp;origem=busca" class="edition" title="Folha de S.Paulo">
    <figure>
        <img src="https://acervo.folha.uol.com.br/files/flip/11/89/58/97/5975889/140/5975889.jpg" width="180" height="312.4">
    </figure>                                
    <h3>31.dez.2014</h3>                                           
    <p>
        país. Poder Novo <b>ministro</b> diz que Congresso irá ?expurgar? culpados futuro articulador polí
    </p>                                           
    <small>
        Folha de S.Paulo, Ano 94 - N° 20.090<br>                                
        A1 - 1 ocorrência                                
    </small>       
</a>

This XPath,

//a[contains(.,"A1")]/@href

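will return the href attributes of all a elements whose string value contains the substring "A1". Because the test applies to the element's entire string value (the concatenated text of all its descendants), a link containing "A10" anywhere in its text would match as well.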
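To make the predicate concrete, here is a minimal sketch that emulates it in plain Python. The standard library's ElementTree does not support the XPath contains() function, so the sketch rebuilds the string value with itertext() and tests it directly. The PAGE snippet is a simplified, well-formed stand-in for the question's HTML, the second link is a hypothetical addition to show a non-match, and hrefs_containing is an illustrative helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the page in the question.
# The second <a> is hypothetical and exists only to show a non-match.
PAGE = """
<div>
  <a href="./leitor.do?numero=20090&amp;anchor=5975889" class="edition">
    <h3>31.dez.2014</h3>
    <small>Folha de S.Paulo, Ano 94<br/>A1 - 1 ocorrência</small>
  </a>
  <a href="./leitor.do?numero=20091&amp;anchor=5975900" class="edition">
    <h3>01.jan.2015</h3>
    <small>Folha de S.Paulo, Ano 94<br/>B2 - 1 ocorrência</small>
  </a>
</div>
"""


def hrefs_containing(root, needle):
    """Emulate //a[contains(., needle)]/@href on an ElementTree root."""
    return [
        a.get("href")
        for a in root.iter("a")
        # "".join(a.itertext()) is the element's string value:
        # the text of the element and all of its descendants, in order.
        if a.get("href") is not None and needle in "".join(a.itertext())
    ]


root = ET.fromstring(PAGE)
hrefs = hrefs_containing(root, "A1")
print(hrefs)
```

Only the first link is returned, because "A1" appears inside its small element, and the entity &amp; in the attribute is decoded to a plain & by the parser. With a full XPath engine the expression above can be evaluated directly, without this emulation.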

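You don't have to use XPath for that. Selenium can locate links by partial link text: call driver.find_elements_by_partial_link_text("A1"), then call element.get_attribute("href") on each returned element. (Newer Selenium 4 releases no longer provide the find_elements_by_* helpers; the equivalent call there is driver.find_elements(By.PARTIAL_LINK_TEXT, "A1"), with By imported from selenium.webdriver.common.by.)

You can combine it into one line as follows:

all_hrefs = [el.get_attribute("href") for el in driver.find_elements_by_partial_link_text("A1")]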

