PHP: find certain characters inside an HTML tag and replace the whole tag with a string

Question

I have extracted a string value from my SQL table, and it looks like this:

<p>Commodity Exchange on 5 April 2016 settled as following graph:</p> 
<p><img alt=\"\" src=\"ckeditor/plugins/imageuploader/uploads/986dfdea.png\" 
style=\"height:163px; width:650px\" /></p></p> 
<p>end of string</p>

I want to get the image name 986dfdea.png from inside the HTML tag (the string contains a lot of tags, and I want to be able to tell that this one contains an image), and then replace the whole tag with a placeholder such as '#image1'.

Eventually it would become this:

<p>Commodity Exchange on 5 April 2016 settled as following graph:</p> 
#image1 
<p>end of string</p>

I'm developing an API for mobile apps, but I'm a beginner in PHP and still can't achieve this, even after reading these references:

PHP/regex: How to get the string value of HTML tag?

How to extract img src, title and alt from html using php?

Please help.

Answer 1

Yes, you could use a regex and it would take much less code, but HTML shouldn't be parsed with a regex, so here is what you need:

Your string contains invalid HTML (the doubled </p></p>), so we use tidy_repair_string to clean it up.
Use DOMXPath to query for img tags inside p tags.
Remove any extra " and get the image file name with getAttribute("src") and basename.
Create a text node with createTextNode whose value is #imagename.
Use replaceChild to replace the img inside the p with the text node created above.
Clean up the !DOCTYPE, html and body tags that DOMDocument adds automatically when it loads the HTML.

<?php
// Requires the tidy and dom extensions.
$html = <<<EOF
<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
<p><img alt=\"\" src=\"ckeditor/plugins/imageuploader/uploads/986dfdea.png\"
style=\"height:163px; width:650px\" /></p></p>
<p>end of string</p>
EOF;

// Repair the invalid markup before handing it to DOMDocument.
$html = tidy_repair_string($html, array(
    'output-html'     => true,
    'wrap'            => 80,
    'show-body-only'  => true,
    'clean'           => true,
    'input-encoding'  => 'utf8',
    'output-encoding' => 'utf8',
));

$dom = new DOMDocument();
$dom->loadHTML($html);

$x = new DOMXPath($dom);
foreach ($x->query('//p/img') as $pImg) {
    // Get the image file name, dropping any stray quote characters first.
    $imgFileName = basename(str_replace('"', "", $pImg->getAttribute("src")));
    // Replace the img with a text node such as "#986dfdea.png".
    $replace = $dom->createTextNode("#$imgFileName");
    $pImg->parentNode->replaceChild($replace, $pImg);
}

// Clean up once, after the loop; doing it inside the loop would break
// as soon as the string holds a second image.
// loadHTML adds a !DOCTYPE node, so remove it:
$dom->removeChild($dom->firstChild);
// It also wraps the markup in <html><body></body></html>, so remove that:
$dom->replaceChild($dom->firstChild->firstChild, $dom->firstChild);
echo str_replace(array("<body>", "</body>"), "", $dom->saveHTML());

Output:

<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
<p>#986dfdea.png</p>
<p>end of string</p>

Ideone Demo
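
The code above keeps the surrounding <p> and uses the file name as the marker. If you would rather have the numbered placeholders from the question (#image1, #image2, and so on) with the whole <p> replaced, something along these lines might work. This is only a rough sketch: the function name replaceImageParagraphs and the shape of the returned array are made up for illustration, and it assumes the only damage in the stored string is the backslash-escaped quotes plus the stray </p>, so it relies on stripslashes and libxml's error recovery instead of tidy.

```php
<?php
// Hypothetical helper, not part of the original answer.
function replaceImageParagraphs(string $html): array
{
    // Assumption: the stored value still carries backslash-escaped quotes (\").
    $html = stripslashes($html);

    $dom  = new DOMDocument();
    // Collect parser errors internally; the stray </p> would otherwise emit warnings.
    $prev = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($prev);

    $xpath  = new DOMXPath($dom);
    $images = [];
    // Select every <p> that has an <img> as a direct child.
    foreach ($xpath->query('//p[img]') as $p) {
        $img         = $xpath->query('img', $p)->item(0);
        $placeholder = '#image' . (count($images) + 1);
        $images[$placeholder] = basename($img->getAttribute('src'));
        // Replace the whole <p>, not only the <img> inside it.
        $p->parentNode->replaceChild($dom->createTextNode($placeholder), $p);
    }

    // Serialize only the children of <body>, skipping the wrappers loadHTML adds.
    $out  = '';
    $body = $dom->getElementsByTagName('body')->item(0);
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return ['html' => $out, 'images' => $images];
}

// Nowdoc keeps the backslashes exactly as they appear in the question.
$input = <<<'EOF'
<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
<p><img alt=\"\" src=\"ckeditor/plugins/imageuploader/uploads/986dfdea.png\"
style=\"height:163px; width:650px\" /></p></p>
<p>end of string</p>
EOF;

$result = replaceImageParagraphs($input);
echo $result['html'], PHP_EOL;
print_r($result['images']);
```

The images array maps each placeholder back to its file name, so the mobile client can look up which image a marker stands for. Keep in mind that loadHTML assumes ISO-8859-1 unless the markup declares a charset, so non-ASCII text may need an encoding hint.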
