Parsing img tags and the following HTML with DOM (PHP)

Question

I have code that parses img tags and text. When I run it in a PHP file, it only shows img src, abc, img src, def. My HTML is also not regular: the img tags may be wrapped in a link.

I want to parse each img together with the HTML that follows it, like this:

Array
(
    [0] => Array
        (
            [src] => http://www.whatever.com
            [text] =>  abc
    <br>
    <h3>title</h3>
    <div class="content">content <a href="link">my link</a></div>
        )

    [1] => Array
        (
            [src] => http://goingnowhere.com
            [text] =>  def
    <br>
    <h3>title 2</h3>
    <div class="content">content <a href="link">my link</a>

    bla bla bla

    </div>
        )

)

How can I do this? My current code:

<?php $sample_html = '
<img src="http://www.whatever.com" alt="" />
abc
<br>
<h3>title</h3>
<div class="content">content <a href="link">my link</a></div>
<img src="http://goingnowhere.com" alt="">
def
<br>
<h3>title 2</h3>
<div class="content">content <a href="link">my link</a>

bla bla bla

</div>
';

$dom = new DOMDocument();
$dom->loadHTML($sample_html);

$data = array();
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    $data[] = array(
        'src' => $image->getAttribute('src'),
        'text' => trim($image->nextSibling->textContent),
    );
}

echo '<pre>';
print_r($data); ?>

Answer 1

Use XPath to find every img tag, then walk the sibling nodes that follow each one and collect the markup between two consecutive img tags.

<?php $sample_html = '
<img src="http://www.whatever.com" alt="" />
abc
<br>
<h3>title</h3>
<div class="content">content <a href="link">my link</a></div>
<img src="http://goingnowhere.com" alt="">
def
<br>
<h3>title 2</h3>
<div class="content">content <a href="link">my link</a>

bla bla bla

</div>
';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($sample_html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$images = $xpath->query('//img'); // run the query once, not on every iteration

$arr = array();
// loop through all img tags
foreach ($images as $img) {
    $snippet = ''; // reset for every image so markup never leaks between entries

    // collect the element siblings that follow this img, up to the next img
    for ($node = $img->nextSibling; $node !== null; $node = $node->nextSibling) {
        if (!($node instanceof DOMElement)) {
            continue; // text nodes and comments are skipped; only element markup is kept
        }
        if ($node->tagName === 'img') {
            break; // the next image starts a new entry
        }
        $snippet .= $dom->saveXML($node);
    }

    $arr[] = array(
        'src' => $img->getAttribute('src'),
        'content' => $snippet,
    );
}

print_r($arr);
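
The loop above keeps only element nodes, so loose text such as abc never reaches the result, and it stores the markup under content rather than the text key shown in the question. Below is a minimal sketch of a variant that also keeps text nodes and tolerates an img wrapped in a link. The function name split_by_images and the rule for linked images (start after the a element when it is the direct parent of the img) are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns one entry per
// <img>, pairing its src with the HTML that follows it up to the next image.
function split_by_images(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // keep parser warnings out of the output
    $dom->loadHTML($html);
    libxml_clear_errors();

    $result = array();
    foreach ($dom->getElementsByTagName('img') as $img) {
        // Assumption: a linked image looks like <a href="..."><img ...></a>,
        // so the walk starts after the <a> wrapper instead of after the <img>.
        $start = $img;
        if ($img->parentNode !== null && $img->parentNode->nodeName === 'a') {
            $start = $img->parentNode;
        }

        $text = '';
        for ($node = $start->nextSibling; $node !== null; $node = $node->nextSibling) {
            // Stop at the next image, whether it is bare or nested in an element.
            if ($node instanceof DOMElement
                && ($node->nodeName === 'img'
                    || $node->getElementsByTagName('img')->length > 0)) {
                break;
            }
            // saveHTML() serializes text nodes as well as elements.
            $text .= $dom->saveHTML($node);
        }

        $result[] = array(
            'src'  => $img->getAttribute('src'),
            'text' => trim($text),
        );
    }
    return $result;
}

// Same sample markup as in the question.
$sample_html = '
<img src="http://www.whatever.com" alt="" />
abc
<br>
<h3>title</h3>
<div class="content">content <a href="link">my link</a></div>
<img src="http://goingnowhere.com" alt="">
def
<br>
<h3>title 2</h3>
<div class="content">content <a href="link">my link</a>

bla bla bla

</div>
';

$result = split_by_images($sample_html);
print_r($result);
```

On the question's sample this should yield two entries whose text values begin with abc and def. saveHTML() is used instead of saveXML() so that void elements come back as <br> rather than <br/>; whitespace in the serialized output may differ slightly from the input.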


Answer 1 (0, accepted, 2019-08-23 08:03:35)
