XPath: How do I capture the previous element?

Question

I have the following structure:

<p>File name</p>
<a href="https://somelink.pdf">Download</a>

I need to capture the link (the a element) and its name (the p element) using CSS and XPath. First, I use a CSS selector to find all links whose href value ends in .pdf ( a[href$=".pdf"] ):

for i in response.css('a[href$=".pdf"]'):
    link = i.css('::attr("href")').get()
    name = i.xpath(?????????)
    print(name, link)

How do I capture the text in the p element using XPath?

Answer 1

Starting from `a`

This XPath,

//a[.="Download"]/preceding-sibling::p[1]

will select the nearest preceding p sibling of each a element whose string value equals "Download".

Starting from `p`

This XPath,

//p[.="File name"]/following-sibling::a[1]

will select the nearest following a sibling of each p element whose string value equals "File name".

In either case, you can select the text node child by appending /text() to the XPaths.
