XPath / PHP - Return index of specific tag that matches a regex
I'm trying to get the index of a tag whose href matches a certain regex, but whatever I try throws a warning saying that the expression is invalid. Here's an example.
$dom = new DOMDocument();
$dom->loadHTML($html);
$url_check = testurl.com
$finder = new DomXPath($dom);
$finder->registerNamespace("php", "http://php.net/xpath");
$finder->registerPhpFunctions('preg_match');
//Updated to fix some errors, still invalid expression
$index = $finder->evaluate("count((/ol[@id='rso']/li[not(@id) and @class = 'g' and h3[@class='r']/a[php:function('preg_match','/^(http://|https://|ftp://)?(www(\d+)?.)?($url_check)\/?$/', string(@href) > 0)]])/preceding-sibling::*)");
$html is a string that stores the HTML of a webpage, which contains something like this:
<ol id="wrap">
<li class="list">
<h3 class="j">
<a href="http://xxxxxx.com">Not the one I'm trying to match</a>
</h3>
</li>
.
.
.
<li class="list">
<h3 class="j">
<a href="http://testurl.com">Click here</a>
</h3>
</li>
</ol>
Any suggestion is appreciated, and if you know a better/faster way to do this feel free to share :)
I found at least three problems in your expression:
preceding-siblings should be singular, not plural.
The count() function has no closing parenthesis.
$url_check = testurl.com has no quotes (should trigger a syntax error).
Fixed code:
$index = $finder->evaluate("count(//ol[@id='wrap']/li[@class='list']/h3[@class='j']/a[php:function('preg_match', '#^(http://|https://|ftp://)?(www(\d+)?\.)?($url_check)/?$#', string(@href)) > 0]/preceding-sibling::li[@class='list'])");
Moreover, the example HTML you gave us doesn't produce any result for the expression (each <a> element has no siblings whatsoever). So, even with these fixes, the expression still returns 0 for your test case, which is expected.
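If what you want is the position of the matching <li> rather than of the <a>, one option is to move the regex test into a predicate on the <li> and count its preceding <li> siblings. This is only a minimal sketch under some assumptions: the sample markup is a shortened version of yours, '#' is used as the regex delimiter so the slashes in "http://" need no escaping, and preg_quote() is added so the dot in the host name is matched literally. The $pattern and $query variables are names introduced here for illustration.

```php
<?php
// Shortened sample markup modelled on the question (assumption: two items only).
$html = <<<HTML
<ol id="wrap">
  <li class="list"><h3 class="j"><a href="http://xxxxxx.com">Not the one</a></h3></li>
  <li class="list"><h3 class="j"><a href="http://testurl.com">Click here</a></h3></li>
</ol>
HTML;

$url_check = 'testurl.com';

$dom = new DOMDocument();
$dom->loadHTML($html);

$finder = new DOMXPath($dom);
$finder->registerNamespace('php', 'http://php.net/xpath');
$finder->registerPhpFunctions('preg_match');

// '#' delimiter avoids clashing with the slashes in the scheme;
// preg_quote() escapes regex metacharacters in the host name.
$pattern = '#^(https?://|ftp://)?(www\d*\.)?' . preg_quote($url_check, '#') . '/?$#';

// Select the <li> whose link matches, then count the <li> siblings before it.
// "//ol" is used because loadHTML() wraps the fragment in <html><body>.
$query = "count(//ol[@id='wrap']/li[@class='list']"
       . "[h3[@class='j']/a[php:function('preg_match', '$pattern', string(@href)) > 0]]"
       . "/preceding-sibling::li[@class='list'])";

// evaluate() returns a float for count(); cast it to get a zero-based index.
$index = (int) $finder->evaluate($query);

echo $index, PHP_EOL;
```

For this sample the matching item is the second <li>, so the zero-based index is 1. Note that a result of 0 is ambiguous: it can mean the first item matched or that nothing matched, so you may want a separate check that the matching <li> exists.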