
preg_match: getting the contents of a div that contains nested div tags

I want to extract the entire contents of the div tag with the class viewContent, but when I run my code, the match stops at the first closing div tag. What should I do? My sample code is below, but it still only captures up to the first closing div tag. Thank you for your help.

  preg_match_all('#<div class="viewContent"[^>]*>(.*?)</div[^>]*>#is', $content, $s);
  print_r($s);

Here is an image of my code.

Choosing between lazy and greedy matching is of little use here, because either way the pattern can end at a </div> that does not correspond to <div class="viewContent">. The closing comment is useful instead, since it logically marks the end of the desired div.
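
To make the problem concrete, here is a minimal sketch. The sample markup in $content is an assumption (the original HTML was only posted as an image); it has one nested div and the closing comment mentioned above.

```php
<?php
// Hypothetical sample markup: an outer viewContent div wrapping one nested div.
$content = '<div class="viewContent"><div class="inner">A</div>B</div><!--viewContent-->';

// Lazy (.*?) stops at the first </div>, which belongs to the inner div.
preg_match_all('#<div class="viewContent"[^>]*>(.*?)</div[^>]*>#is', $content, $lazy);

// Greedy (.*) runs to the last </div> in the input, so it overshoots
// as soon as any other div follows the one we want.
preg_match_all('#<div class="viewContent"[^>]*>(.*)</div[^>]*>#is', $content . '<div>C</div>', $greedy);

print_r($lazy[1]);   // the capture ends after "A"; "B" is lost
print_r($greedy[1]); // the capture runs on into the following <div>C
```

Neither quantifier knows which </div> pairs with which opening tag, which is why an extra marker (or a parser) is needed.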

With the following regex, only the contents of <div class="viewContent"> are captured.

Regex: <div class="viewContent"[^>]*>(.*?)<\/div[^>]*>(?=<!--viewContent-->)

Explanation:

  • <div class="viewContent"[^>]*>(.*?)<\/div[^>]*> matches the div using a lazy search.

  • (?=<!--viewContent-->) is a positive lookahead for the comment that logically marks the end of the <div>.

Regex101 Demo
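
In PHP, the lookahead pattern could be used roughly as follows. This is a sketch, not a tested drop-in: the markup in $content is hypothetical, and it assumes the <!--viewContent--> comment follows the closing tag directly with no whitespace in between. Because # is the delimiter, the slash in </div does not need escaping.

```php
<?php
// Hypothetical sample markup; the comment is assumed to sit
// immediately after the closing tag of the target div.
$content = '<div class="viewContent"><div class="inner">A</div>B</div><!--viewContent-->';

// The lookahead rejects any </div> that is not followed by the marker comment,
// so the lazy group keeps expanding past the inner closing tag.
$pattern = '#<div class="viewContent"[^>]*>(.*?)</div[^>]*>(?=<!--viewContent-->)#is';
preg_match_all($pattern, $content, $s);

print_r($s[1]); // one capture: <div class="inner">A</div>B
```

If whitespace or a newline can appear between </div> and the comment, adding \s* at the start of the lookahead would be one way to allow for it.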

If you can guarantee that the closing tag of the div you want is always followed by <!--viewContent-->, you can use:

<div class="viewContent"[^>]*>(.*?)</div[^>]*><!--viewContent-->

Otherwise, you might just want to use an HTML parser.

You can use PHP's built-in DOMDocument class to parse the HTML of the page, and the DOMXPath class to extract the value of an HTML element with a certain class:

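<?php
$html = ''; // HTML goes here
$doc = new DOMDocument();
// @ suppresses the warnings libxml raises on malformed or HTML5 markup
@$doc->loadHTML($html);
$classname = "viewContent";
$finder = new DOMXPath($doc);
// Select every element whose class attribute contains $classname
$spanner = $finder->query("//*[contains(@class, '$classname')]");
foreach ($spanner as $entry) {
  // nodeValue is the element's text content, with nested tags stripped
  echo $entry->nodeValue;
}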
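
Note that nodeValue returns only the text inside the element. If the nested markup itself is wanted, as the question suggests, one possible variation is to serialize the child nodes with saveHTML. The sample markup and the variable names below are assumptions for illustration.

```php
<?php
// Hypothetical sample markup for illustration.
$html = '<div class="viewContent"><div class="inner">A</div>B</div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match "viewContent" as a whole class token, not as a substring of another class name.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' viewContent ')]";

$results = [];
foreach ($xpath->query($query) as $div) {
    $inner = '';
    // saveHTML($node) serializes one node; joining the children gives the inner markup.
    foreach ($div->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    $results[] = $inner;
}

print_r($results);
```

Since the parser tracks nesting itself, this approach does not depend on a marker comment after the closing tag.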
