
Get tag values from every tag inside nested tags in HTML with PHP

I'm working on code that gets the text-only values of all tags in an HTML file. If a tag has nested tags, the code should descend into its children and return the value of each tag that has no children. I tried the code below, but it misses some of the values.

PHP code:

$dochtml = new DOMDocument();
$dochtml->loadHTMLFile("index2.html");
$nodes = $dochtml->getElementsByTagName("a");
gettagsvalue($nodes);

function gettagsvalue($nodes) {
    $tags = ["h1","h2","h3","h4","h5","h6","h7","a","img","li","span","p","pre","i","strong","div","ul"];
    for ($i = 0; $i < $nodes->length; $i++) {
        foreach ($tags as $tag) {
            $children = $nodes->item($i)->getElementsByTagName($tag);
            if ($children->length == 1) {
                // Exactly one descendant with this tag name: dump the DOMElement object.
                echo "here" . "<br><br><br> $tag";
                echo "<pre>"; print_r($children->item(0)); echo "</pre>";
            } elseif ($children->length > 1) {
                // Several descendants with this tag name: recurse into them.
                // Elements that contain no further tags reach neither branch, so nothing is printed for them.
                echo "there" . "<br><br><br> $tag";
                gettagsvalue($children);
            }
        }
    }
}

I expected to get:

"Green" "Valley"

HTML:

<a href="index.html" id="aaaaaaaaaaaa2015284957">
    <img src="images/logo.png" width="50px" height="50px" id="imgaaaaaaaaaaimg732756221">
    <span>Green</span>
    <span id="spanaaaaaaaaaaspan1106733773">Valley</span>
</a>

Did you consider using the textContent property? It concatenates the text nodes of all nested nodes. See "php domdocument read element inner text" and "PHP DOM textContent vs nodeValue?" for more details.
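As a rough sketch of that suggestion (not a definitive implementation), the snippet below reads textContent on the a element from the question. It also shows one possible way to collect the text of each childless element separately, which seems closer to the expected "Green" "Valley" output. The markup is inlined with loadHTML instead of being read from index2.html, and leafTexts is a hypothetical helper name introduced only for this illustration.

```php
<?php
// Assumption: the question's markup is inlined here rather than loaded from index2.html.
$html = <<<HTML
<a href="index.html" id="aaaaaaaaaaaa2015284957">
    <img src="images/logo.png" width="50px" height="50px" id="imgaaaaaaaaaaimg732756221">
    <span>Green</span>
    <span id="spanaaaaaaaaaaspan1106733773">Valley</span>
</a>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);

// Hypothetical helper: collect the trimmed text of every element
// that has no child elements (a "leaf" tag), in document order.
function leafTexts(DOMElement $node, array &$out = []): array
{
    $hasElementChild = false;
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMElement) {
            $hasElementChild = true;
            leafTexts($child, $out);
        }
    }
    if (!$hasElementChild) {
        $text = trim($node->textContent);
        if ($text !== '') {          // skips empty leaves such as <img>
            $out[] = $text;
        }
    }
    return $out;
}

foreach ($doc->getElementsByTagName('a') as $a) {
    // textContent joins every descendant text node, including the
    // whitespace between tags, so collapse it before printing.
    echo preg_replace('/\s+/', ' ', trim($a->textContent)), PHP_EOL; // Green Valley

    // One entry per childless element.
    echo implode(', ', leafTexts($a)), PHP_EOL;                       // Green, Valley
}
```

textContent alone is enough when one combined string per link is acceptable. The recursive helper is only needed when each nested tag's value has to be kept apart.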
