PHP: find a certain character in an HTML tag and replace the whole tag with a string
I have extracted a string value from my SQL table, and it looks like this:
<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
<p><img alt=\"\" src=\"ckeditor/plugins/imageuploader/uploads/986dfdea.png\"
style=\"height:163px; width:650px\" /></p></p>
<p>end of string</p>
I want to get the image name 986dfdea.png from inside the HTML tag (there are a lot of <p></p> tags inside the string, and I want to be able to tell which one contains an image), and replace the whole tag with a placeholder such as '#image1'.
Eventually it would become this:
<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
#image1
<p>end of string</p>
I'm developing an API for mobile apps, but my PHP skills are very basic, and I still can't achieve my goal after reading these references:
PHP/regex: How to get the string value of HTML tag?
How to extract img src, title and alt from html using php?
Please help.
Yes, you could use a regex and you'd need far less code, but we shouldn't parse HTML with a regex, so here's what you need:
1. Your string contains invalid HTML (</p></p>), so we use tidy_repair_string to clean it.
2. Use DOMXpath() to query for p tags with img tags inside.
3. Remove any extra " and get the image filename with getAttribute("src") and basename.
4. Create a new text node with createTextNode, using #imagename as its value.
5. Use replaceChild to replace the img inside the p with the text node created above.
6. Clean up the !DOCTYPE, html and body tags that DOMDocument adds automatically when it loads the HTML.
<?php
// Sample input; the \" sequences stay in the string, as in the database value.
$html = <<<EOF
<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
<p><img alt=\"\" src=\"ckeditor/plugins/imageuploader/uploads/986dfdea.png\"
style=\"height:163px; width:650px\" /></p></p>
<p>end of string</p>
EOF;
// Repair the invalid markup (the stray </p>) before parsing.
$html = tidy_repair_string($html, array(
    'output-html' => true,
    'wrap' => 80,
    'show-body-only' => true,
    'clean' => true,
    'input-encoding' => 'utf8',
    'output-encoding' => 'utf8',
));
$dom = new DOMDocument();
$dom->loadHTML($html);
$x = new DOMXpath($dom);
// Replace every <img> that is a direct child of a <p> with a text node.
foreach ($x->query('//p/img') as $pImg) {
    // get image name, removing any extra quote characters first
    $imgFileName = basename(str_replace('"', "", $pImg->getAttribute("src")));
    $replace = $dom->createTextNode("#$imgFileName");
    $pImg->parentNode->replaceChild($replace, $pImg);
}
// The cleanup and output run once, after the loop has finished.
# loadHTML causes a !DOCTYPE tag to be added, so remove it:
$dom->removeChild($dom->firstChild);
# it also wraps the code in <html><body></body></html>, so remove that:
$dom->replaceChild($dom->firstChild->firstChild, $dom->firstChild);
echo str_replace(array("<body>", "</body>"), "", $dom->saveHTML());
Output:
<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
<p>#986dfdea.png</p>
<p>end of string</p>
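The output above keeps the <p> wrapper and uses the file name as the placeholder. If you want what the question literally asks for (the whole paragraph replaced by #image1, #image2, ...), a variant along these lines might work. It is only a sketch under some assumptions: the function name replaceImageParagraphs and the wrapper id fragment-root are made up for illustration, stripslashes is assumed to be a safe way to undo the \" escaping in the stored string, the tidy extension is skipped in favour of letting DOMDocument tolerate the stray </p>, and character encoding is not handled (non-ASCII text would need an encoding hint before loadHTML).

```php
<?php
// Hypothetical helper, not part of the original answer: replaces each <p>
// that contains an <img> with a numbered placeholder and returns the new
// markup together with a placeholder => file name map.
function replaceImageParagraphs(string $html): array
{
    // Assumption: quotes are stored backslash-escaped (\"), so undo that.
    $html = stripslashes($html);

    $dom = new DOMDocument();
    // Collect parser warnings (such as the stray </p>) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a known root so it can be extracted again later.
    $dom->loadHTML('<div id="fragment-root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $images = array();
    $n = 0;
    // p[.//img] selects the paragraphs themselves, not the images in them.
    foreach (iterator_to_array($xpath->query('//p[.//img]')) as $p) {
        $img = $xpath->query('.//img', $p)->item(0);
        $placeholder = '#image' . (++$n);
        $images[$placeholder] = basename($img->getAttribute('src'));
        // Swap the whole paragraph for a plain text node.
        $p->parentNode->replaceChild($dom->createTextNode($placeholder), $p);
    }

    // Serialize only the children of the wrapper, so no !DOCTYPE, html or
    // body tags end up in the result.
    $root = $xpath->query('//div[@id="fragment-root"]')->item(0);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return array(trim($out), $images);
}

// Nowdoc keeps the backslashes literal, like the value read from the table.
$sample = <<<'EOF'
<p>Commodity Exchange on 5 April 2016 settled as following graph:</p>
<p><img alt=\"\" src=\"ckeditor/plugins/imageuploader/uploads/986dfdea.png\"
style=\"height:163px; width:650px\" /></p></p>
<p>end of string</p>
EOF;

list($text, $images) = replaceImageParagraphs($sample);
echo $text, PHP_EOL;
print_r($images);
```

Returning the map alongside the text lets the mobile client look up which file each placeholder stands for. Serializing the wrapper's children one by one also avoids the removeChild/replaceChild cleanup of the generated tags.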