Regular expression to remove a div
I have a file like this:
<div clas='dsfdsf'> this is first div </div>
<div clas='dsfdsf'> this is second div </div>
<div class="remove">
<table>
<thead>
<tr>
<th colspan="2">Mehr zum Thema</th>
</tr>
</thead>
<tbody>
<tr> this is tr</tr>
<tr> this row no 2 </tr>
</tbody>
</table>
</div>
<div clas='sasas'> this is last div </div>
I have read the contents of this file into a variable like this:
$Cont = file_get_contents('myfile');
Now I want to use preg_replace to replace the div with the class name "remove". I have tried this:
$patterns = "%<div class='remove'>(.+?)</div>%";
$strPageSource = preg_replace($patterns, '', $Cont);
This did not work. What is the correct regular expression for this replacement?
Try this code.
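// Double quotes match the file; the "s" modifier lets "." span newlines. Only safe when the div contains no nested <div>.
$strPageSource = preg_replace('%<div class="remove">.*?</div>%is', '<div class="newClass">Newthings</div>', $Cont);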
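As noted in the comments, you should not use regular expressions to parse HTML: if the <div> contains other nested <div> elements, there is no sensible way to extract it with a regex. For example: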
<div clas='dsfdsf'> this is second div </div>
<div class="remove">
some text <div>nested div</div> more text and some elements<br />
</div>
What you need to do is find <div class="remove"> and then walk through the HTML (parse it) as follows:
1) set $nesting_counter = 0
2) proceed through HTML until you encounter either <div> or </div>
   a) if found <div>
        $nesting_counter++ and go to point 2)
   b) if found </div>
        if $nesting_counter > 0
            $nesting_counter-- and go to point 2)
        else
            you've found the closing tag for your `<div class="remove">`. Remember the current position and just remove that substring.
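The steps above can be sketched in PHP roughly as follows. This is only a minimal illustration, not a full HTML parser: the function name `removeDivBlock` is hypothetical, the opening tag is matched literally (apart from letter case), and the code assumes that no `<div` or `</div>` text appears inside comments, scripts, or attribute values.

```php
<?php
// Hypothetical helper (not from the original answer): removes the first <div>
// whose opening tag equals $openTag, together with any nested <div> elements.
function removeDivBlock(string $html, string $openTag = '<div class="remove">'): string
{
    $start = stripos($html, $openTag);
    if ($start === false) {
        return $html; // target div not present, nothing to remove
    }

    $nesting = 0;                          // step 1
    $offset  = $start + strlen($openTag);  // resume scanning after the opening tag

    // Step 2: find the next <div ...> or </div> starting from $offset.
    while (preg_match('%<(/?)div\b[^>]*>%i', $html, $m, PREG_OFFSET_CAPTURE, $offset)) {
        $tagEnd    = $m[0][1] + strlen($m[0][0]);
        $isClosing = ($m[1][0] === '/');

        if (!$isClosing) {
            $nesting++;                    // 2a: a nested <div> opens
        } elseif ($nesting > 0) {
            $nesting--;                    // 2b: a nested <div> closes
        } else {
            // 2b, else branch: this </div> closes the target div,
            // so cut everything from its opening tag to this point.
            return substr($html, 0, $start) . substr($html, $tagEnd);
        }
        $offset = $tagEnd;
    }

    return $html; // unbalanced markup: leave the input untouched
}

// Usage with the nested example from the answer.
$Cont = "<div clas='dsfdsf'> this is second div </div>\n"
      . "<div class=\"remove\">\n"
      . "some text <div>nested div</div> more text and some elements<br />\n"
      . "</div>\n"
      . "<div clas='sasas'> this is last div </div>\n";
$strPageSource = removeDivBlock($Cont);
echo $strPageSource;
```

The counter is what lets the scan skip the inner `</div>` and stop at the one that really closes the target. For markup that is less predictable than this, PHP's DOM extension is likely the more robust choice.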