
Get all HREFS from specific xpath query

I have an XPath query:

$q = $xpath->query("//p[@id='v{$versenumber}']/following-sibling::div[@class='admonition']");

This works very well. I have been using the following to extract the HTML I receive from it:

$saveHTML = $dom->saveHTML($q->item(0));

However, inside this result I have hrefs that I want to replace with something else, and I am having trouble actually getting at them. I thought about running another query, identical but with /a at the end, but that didn't return anything.

I would have thought I could access them like this:

$x = $q->item(0)->getElementByTagName('a');

But that doesn't seem to work either. What am I doing wrong?

Update

The HTML I want to parse:

<p id="v1"><span class="verseref">1</span></p>
<div class="notes">
<p class="first">Notes</p>
<p class="last">Paragraph</p>
</div>
<div class="admonition">
<p class="last">HTML with <a href='foobar'>inside it</a>.  I want to get all href attributes from here.</p>
</div>

Using the above query I can get the text fine. The problem is that the href attributes are wrong and I need to change each of them, so I want to deal with each <div class="admonition"> individually, along with all the hrefs inside it.

However, using:

$q = $xpath->query("//p[@id='v{$versenumber}']/following-sibling::div[@class='admonition']//a/@href");

I seem to get a huge number of hrefs for a paragraph that contains only one:

../../ga/ch1/#v1
#v6
#v5
#v6
../../mr/ch16/#v20
../ch12/
../../heb/ch13/#v9
../ch12/
../ch3/#v1
../../lu/ch1/#v6
../../1jo/ch1/#v8
../../1jo/ch1/#v10
../../1jo/ch1/#v7
../../1jo/ch1/#v9
#v1
../../eph/ch4/#v13
../../ro/ch14/
../../ro/ch14/#v1
../ch5/
../ch6/
../ch7/
../ch8/
../ch11/
../ch12/
../ch15/
../../ro/ch14/
#v12
../ch3/#v4
../ch15/#v24
../../eph/ch5/#v17
../../ro/ch8/#v6
../../../ot/ge/ch11/#v3
../../../ot/ps/ch133/
../../../ot/jer/ch32/#v39
../../ac/ch4/#v32
../../ro/ch12/#v16
../../ro/ch15/#v5
../../php/ch1/#v27
../../php/ch2/#v1
../../1th/ch5/#v13
../../jas/ch3/#v13
../../1pe/ch3/#v8
../../eph/ch4/#v13
../ch16/#v15
../ch16/#v17
../ch16/#v24
../../ac/ch18/#v12
../ch16/#v15
../ch16/#v17
../../ac/ch11/#v18
../../mt/ch28/#v19
../../mt/ch26/#v2
../ch2/#v14
../../ro/ch1/#v16
../../ro/ch1/#v16
../../2co/ch4/#v3
#v17
../../ac/ch20/#v30
#v18
../../../ot/isa/ch29/#v14
../../../ot/isa/ch29/#v14
../../../ot/isa/ch29/#v13
../ch2/#v14
../../ro/ch10/#v10
#v21
#v26
../ch2/
#v18
#v11
../../lu/ch6/#v38
../../../ot/ps/ch14/#v1
../../../ot/ps/ch53/#v1
../../col/ch2/#v3
#v23
#v18
../../ac/ch5/#v34
../../ac/ch26/#v24
../../ga/ch2/#v1
#v26
#v25
../../ac/ch24/#v25
../../2co/ch10/#v12
../../ro/ch7/#v18
#v30
../../ro/ch7/#v18
../../joh/ch8/#v44
../../mt/ch26/#v41
../../ro/ch8/#v18
#v26
../../../ot/isa/ch42/#v8
../../joh/ch3/#v3
../../../ot/pr/ch3/#v6
../../ro/ch8/#v23
#v26

...which must come from the entire document; otherwise I don't know where it is getting all these hrefs from.

following-sibling is an axis, not a selector; it only specifies the direction of navigation through the DOM. Your following-sibling::div[@class='admonition'] asks for all the "admonition" divs that follow the selected p at any distance, which is why hrefs from the rest of the document show up. The position() function should help you solve this. Try something like following-sibling::div[@class='admonition' and position()=1]. Note that in this combined form position() counts among all following div siblings, so it only matches when the admonition div is the very first div after the p. If another div comes first (such as the "notes" div in your sample), chain the predicates instead: following-sibling::div[@class='admonition'][1].
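To make the difference between these predicate forms concrete, here is a minimal sketch run against markup adapted from the question. The wrapping div, the second verse (v2) and its link are assumptions added only to show the over-matching; they are not part of the original page.

```php
<?php
// Keep libxml parser warnings out of the output.
libxml_use_internal_errors(true);

// Sample markup adapted from the question. The outer div and the
// second verse (v2) are assumptions added for illustration.
$html = <<<'HTML'
<div id="root">
<p id="v1"><span class="verseref">1</span></p>
<div class="notes"><p class="last">Paragraph</p></div>
<div class="admonition"><p class="last">HTML with <a href="foobar">inside it</a>.</p></div>
<p id="v2"><span class="verseref">2</span></p>
<div class="admonition"><p class="last">Another <a href="baz">link</a>.</p></div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$base = "//p[@id='v1']/following-sibling::";

// Every later admonition div matches, including the one that belongs to v2.
$all = $xpath->query($base . "div[@class='admonition']//a/@href");

// position() counts among all following div siblings; the first one is
// the "notes" div, so no div satisfies both conditions.
$combined = $xpath->query($base . "div[@class='admonition' and position()=1]//a/@href");

// Chained predicates: filter by class first, then take the first survivor.
$first = $xpath->query($base . "div[@class='admonition'][1]//a/@href");

echo $all->length, "\n";       // 2
echo $combined->length, "\n";  // 0
echo $first->length, ' ', $first->item(0)->nodeValue, "\n"; // 1 foobar
```

With only one admonition div directly after the p, the combined and chained forms would behave the same; they diverge as soon as some other div sits in between.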

$a_tags = $xpath->query('.//a', $q->item(0));
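The relative query above can be extended into a small end-to-end sketch that rewrites each href inside the first admonition div and then serialises it. The sample markup is taken from the question; the '/verses/' prefix is a purely hypothetical rewrite rule standing in for whatever correction the links actually need.

```php
<?php
// Keep libxml parser warnings out of the output.
libxml_use_internal_errors(true);

// Markup from the question, wrapped in a div so the elements are siblings.
$html = <<<'HTML'
<div id="root">
<p id="v1"><span class="verseref">1</span></p>
<div class="notes"><p class="first">Notes</p><p class="last">Paragraph</p></div>
<div class="admonition"><p class="last">HTML with <a href="foobar">inside it</a>.</p></div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$versenumber = 1;

// Same query as in the question; item(0) is the first match in document order.
$q = $xpath->query("//p[@id='v{$versenumber}']/following-sibling::div[@class='admonition']");
$div = $q->item(0);

$saveHTML = '';
if ($div !== null) {
    // The leading "." keeps the search inside $div instead of the whole document.
    foreach ($xpath->query('.//a', $div) as $a) {
        $old = $a->getAttribute('href');
        // Hypothetical rewrite rule: replace with the real correction.
        $a->setAttribute('href', '/verses/' . $old);
    }
    // Serialise only the modified div.
    $saveHTML = $dom->saveHTML($div);
}

echo $saveHTML, "\n";
```

Calling $div->getElementsByTagName('a') would give the same anchors here. The method name is plural, so the getElementByTagName call in the question fails on the name alone.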
