Getting the List of Child Nodes from within an HTML Tag using PHP

I am currently using the PHP DOM to get the BODY tag from an HTML string.

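$doc = new DOMDocument();
$doc->loadHTML($HTML); // parsed here, but $doc is not used below
// Strip everything up to <body ...> and everything from </body> onward
$body = preg_replace("/.*<body[^>]*>|<\/body>.*/si", "", $HTML);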

The code above gives me the complete HTML inside the body tag of a given HTML document.

Can I get the HTML tags inside $body as an array?

If possible, I would use the DOM: it will make your solution much more reliable and cleaner.
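As a minimal sketch of what a DOM-only version of the question's snippet might look like: the sample `$HTML` string is an assumption made up for illustration, and the code presumes the parsed document actually contains a `<body>` element.

```php
<?php
// Hypothetical input for illustration; any HTML string would do.
$HTML = '<html><body class="x"><p>Hello</p><div>World</div></body></html>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($HTML);
libxml_clear_errors();

// getElementsByTagName() returns a DOMNodeList; item(0) is the first <body>.
$bodyNode = $doc->getElementsByTagName('body')->item(0);

// Serialize each child of <body> to rebuild its inner HTML as a string.
$body = '';
foreach ($bodyNode->childNodes as $child) {
    $body .= $doc->saveHTML($child);
}

echo $body, "\n";
```

Here `$body` ends up holding the same kind of string the regex produced, but it comes from the parsed tree, so attributes on `<body>` or unusual whitespace should not trip it up.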

This should get you headed in the right direction (I'm not writing the solution for you, sorry):

$html = file_get_contents("http://google.com");
$dom = new DOMDocument();
// @ suppresses the warnings libxml emits for malformed real-world HTML
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select every element in the document
$elements = $xpath->query("//*");

foreach ($elements as $element) {
    echo "<h1>" . $element->nodeName . "</h1>";

    // childNodes also includes text and comment nodes, not just elements
    foreach ($element->childNodes as $node) {
        echo "<h2>" . $node->nodeName . "</h2>";
        echo $node->nodeValue . "\n";
    }
}
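To answer the original question more directly, the same XPath approach can be narrowed to the direct children of the body and collected into an array. This is only a sketch: `bodyChildTags` is a hypothetical helper name, and the sample markup is made up for illustration.

```php
<?php
// Hypothetical helper: returns the tag names of the direct child
// elements of <body> as a plain PHP array.
function bodyChildTags(string $html): array
{
    $dom = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $tags = [];
    // "/html/body/*" matches element children only, not text or comments.
    foreach ($xpath->query('/html/body/*') as $element) {
        $tags[] = $element->nodeName;
    }
    return $tags;
}

$tags = bodyChildTags('<html><body><h1>Title</h1><p>One</p><p>Two</p></body></html>');
print_r($tags);
```

If the element objects themselves are needed rather than their names, passing the query result to `iterator_to_array()` should give an array of `DOMElement` instances instead.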
