Use Regex to Split Paragraphs that are not wrapped in div or Table

I am trying to insert some text after every paragraph in my content.

I split my content on </p> using the following code:

    $Paragraphs = explode( '</p>', $Content);
    foreach($Paragraphs as $Paragraph){
        // Some code
    }

Now my $Content looks like:

<p></p>
<p></p>
<p></p>
<div><p></p></div>
<p></p>
<p></p>
<div><p></p></div>

I want to split only where the <p> isn't wrapped inside a <div>, a <table>, or anything else.

In other words, the </p> should have a <p> after it.

I read that regex can be helpful in achieving this.

Here's the basic regex I built:


$Pattern = '/<p(|\s+[^>]*)>(.*?)<\/p\s*>/';

if(preg_match_all($Pattern, $Content, $keywords)){

}

This regex currently drops the <p> tag itself from the array: it keeps the content inside the p but not the <p> itself. It also doesn't check whether there is a </p> before it or a <p> after it.
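For reference, here is a minimal sketch of what preg_match_all returns for this pattern. The sample $Content value is made up for illustration and is not the asker's data. Index 0 holds the full matches with their tags, index 2 holds only the inner content, and the paragraph inside the div is matched as well:

```php
<?php
// Hypothetical sample input: one top-level <p> and one <p> nested in a <div>.
$Content = '<p>A</p><div><p>B</p></div>';
$Pattern = '/<p(|\s+[^>]*)>(.*?)<\/p\s*>/';

preg_match_all($Pattern, $Content, $keywords);

// $keywords[0]: full matches, tags included
// $keywords[1]: attribute text of the opening tag (empty in this sample)
// $keywords[2]: inner content only
print_r($keywords[0]);
print_r($keywords[2]);
```

The pattern has no notion of nesting, so it cannot tell the two paragraphs apart on its own.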

If I understood your problem, you have a string with tags such as:

$string = "
<p> Sometext 1 </p>
<p> Sometext 2 </p>
<p> Sometext 3 </p>
<div><p> Sometext Inside A Div </p> </div>
";

And you want to add another element right after each p that is not contained in any other element, and you want to do that purely through PHP, correct?

In my opinion your best option is using DOMDocument.

Take a look at the solution below:

$doc = new DOMDocument();
$doc->loadHTML($string);
foreach ($doc->getElementsByTagName('p') as $idx => $item) {
    if($item->parentNode->nodeName == 'body') {
        $object = $doc->createElement('span', "some new text");
        $item->parentNode->insertBefore($object, $item->nextSibling);
    }
}    

echo $doc->saveHTML();

Basically, I take your string and convert it into an HTML DOM. Then I iterate through all the p elements, and if an element's parent is body, I create a new span element (it can be any element) with some text and insert it right after the iterated item.

The output will look something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
    <body>
        <p> Sometext 1 </p>
        <span>some new text</span>
        <p> Sometext 2 </p>
        <span>some new text</span>
        <p> Sometext 3 </p>
        <span>some new text</span>
        <div>
            <p> Sometext Inside A Div </p> 
        </div>
    </body>
</html>
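If the doctype and the html/body wrappers are not wanted in the result, one possible variation is to serialize only the children of body. The sketch below is an illustration under that assumption, not part of the answer above; the function name insertAfterTopLevelParagraphs and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: add a <span> with $text after every <p> that is a
// direct child of <body>, then return the fragment without the wrappers.
function insertAfterTopLevelParagraphs(string $html, string $text): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }

    // Copy the live child list into an array before modifying the tree.
    foreach (iterator_to_array($body->childNodes) as $node) {
        if ($node->nodeName === 'p') {
            $span = $doc->createElement('span');
            $span->appendChild($doc->createTextNode($text));
            // A null nextSibling makes insertBefore append at the end.
            $body->insertBefore($span, $node->nextSibling);
        }
    }

    // Serialize only the children of <body>.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo insertAfterTopLevelParagraphs(
    '<p>One</p><div><p>Two</p></div><p>Three</p>',
    'some new text'
);
```

Walking only the direct children of body gives the same "parent is body" check as the loop above, and the paragraph inside the div is left alone.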

For more complex manipulations of DOM elements, I would recommend using DOMDocument ( https://www.php.net/manual/en/class.domdocument.php ).

PHP Solution

You can use the PHP string function str_replace for this. In your loop you can build the replacement string and then pass it to str_replace as a parameter.

$text = '<p>hello</p> <p>Hi</p>';
$replace = '</p><span style="color: red;">World</span>';

echo str_replace("</p>",$replace,$text);
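Note that str_replace touches every </p>, including those nested inside a div. As a rough sketch of the condition stated in the question (a </p> that has a <p> after it), a lookahead with preg_replace could be used instead. The sample string and the $pattern, $insert, and $result names are assumptions for illustration:

```php
<?php
// Hypothetical sample: two top-level paragraphs and one nested in a <div>.
$text = '<p>hello</p> <p>Hi</p><div><p>nested</p></div>';
$insert = '<span style="color: red;">World</span>';

// Match a </p> only when the next tag (ignoring whitespace) is another <p>.
// [\s>] after "<p" keeps tags such as <pre> from matching.
$pattern = '~</p>(?=\s*<p[\s>])~i';
$result = preg_replace($pattern, '</p>' . $insert, $text);

echo $result;
```

This is a limited check: a top-level paragraph followed by a div, or one at the very end of the content, is skipped as well, so the DOM approach is more reliable when that matters.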

CSS Solution for simple content

You can do it with pure CSS.

 p::after { content: " - World"; }
 <p>1 x Hello</p> <p>2 x Hello</p>

$string = '<p></p>
<p></p>
<p></p>
<div><p></p></div>
<p></p>
<p></p>
<div><p></p></div>';

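// Split the content into lines (assumes one top-level element per line).
$ex = explode("\n",$string);

foreach($ex as $k => $p){
    // str_contains requires PHP 8. Drop any line that opens a div or a table;
    // matching "<div" and "<table" also covers tags that carry attributes.
    if(str_contains($p,"<div") || str_contains($p,"<table")){
        unset($ex[$k]);
    }
}

// Only the top-level <p> lines remain (original keys are preserved).
print_r($ex);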

