
XPath: How do I capture the previous element?

I have the following structure:

<p>File name</p>
<a href="https://somelink.pdf">Download</a>

I need to capture the link (the a element) and its name (the p element) using CSS and XPath. First, I use a CSS selector to find every link whose href value ends in .pdf ( a[href$=".pdf"] ):

for i in response.css('a[href$=".pdf"]'):
    link = i.css('::attr("href")').get()
    name = i.xpath(?????????)
    print(name, link)

How do I capture the text in the p element using XPath?

Starting from a

This XPath,

//a[.="Download"]/preceding-sibling::p[1]

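selects, for each a element whose string value equals "Download", its nearest preceding sibling p element. Because preceding-sibling is a reverse axis, [1] means the sibling closest to the a element, not the first one in document order.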


Starting from p

This XPath,

//p[.="File name"]/following-sibling::a[1]

selects, for each p element whose string value equals "File name", its first following sibling a element.


In either case, you can select the text node child by appending /text() to the XPaths.
