How to balance tags with PHP

Question

In the following string, I want to replace <!--more--> with some text, FOOBAR, and then truncate the string.

<p>The quick <a href="/">brown</a> fox jumps <!--more-->
over the <a href="/">lazy</a> dog.</p>

I've gotten to this point:

<p>The quick <a href="/">brown</a> fox jumps FOOBAR

...but as you can see, the <p> tag is not closed. Any ideas on how I can consistently balance the tags? I'm pretty new to PHP.

The array I'm working with looks like this:

array(2) {
  [0]=>
  string(50) "<p>The quick <a href="/">brown</a> fox jumps "
  [1]=>
  string(45) " over the <a href="/">lazy</a> dog.</p>"
}

Answer 1

You can use the WordPress force_balance_tags function. The implementation is here:

http://core.trac.wordpress.org/browser/trunk/wp-includes/formatting.php

It is a standalone function, so you can simply copy and paste it into your code.

function force_balance_tags( $text ) {

Usage is simple:

$bad_text = "<div> <p> some text </p> ";

echo force_balance_tags($bad_text);

Since this is part of WordPress, it is tried and tested, which makes it a better choice than an ad hoc regex solution.

Answer 2

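If possible, I would suggest parsing the HTML into a DOM and working with that: walk the text nodes until you find the string, then truncate that text node and remove the other child nodes that follow it (leaving the parent nodes intact). Then re-serialize the DOM to HTML.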
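A minimal sketch of that idea using PHP's bundled DOM extension. The helper name truncateAtMarker() and the wrapper id "wrap" are made up for illustration, and because the marker in the question is an HTML comment, this looks for a comment node rather than a text node. It has only been reasoned through against the example string, so treat it as a starting point.

```php
<?php
// Hypothetical helper (not a library function): cut an HTML fragment at a
// comment marker and let the DOM serializer close whatever is still open.
function truncateAtMarker(string $html, string $marker, string $replacement): string
{
    $doc = new DOMDocument();

    // Wrap the fragment in a known container so it can be found after parsing.
    // Parser warnings about imperfect markup are collected instead of printed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $root  = $xpath->query('//div[@id="wrap"]')->item(0);

    foreach ($xpath->query('.//comment()', $root) as $comment) {
        if (trim($comment->nodeValue) !== $marker) {
            continue;
        }
        // Drop everything that follows the marker, at each level up to the wrapper.
        for ($node = $comment; $node !== null && !$node->isSameNode($root); $node = $node->parentNode) {
            while ($node->nextSibling !== null) {
                $node->parentNode->removeChild($node->nextSibling);
            }
        }
        // Swap the marker itself for the replacement text.
        $comment->parentNode->replaceChild($doc->createTextNode($replacement), $comment);
        break;
    }

    // Serialize only the children of the wrapper, not the html/body shell.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo truncateAtMarker(
    '<p>The quick <a href="/">brown</a> fox jumps <!--more--> over the <a href="/">lazy</a> dog.</p>',
    'more',
    'FOOBAR'
);
```

The parent elements are never touched, so the serializer writes their closing tags on its own. Note that loadHTML() may normalize the markup in other ways (attribute quoting, entity encoding), so the output is not guaranteed to match the input byte for byte.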

Answer 3

I haven't fully tested this, but it works on your example at least. It assumes well-formed XML.

<?php
$reader = new XMLReader;
$writer = new XMLWriter;

// load the XML string into the XMLReader
$reader->xml('<p>The quick <a href="/">brown</a> fox jumps <!--more--> over the <a href="/">lazy</a> dog.</p>');
// write the new XML to memory
$writer->openMemory();
$done = false;

// XMLReader::read() moves the current read location to the next node
while ( !$done && $reader->read()) {
    // choose action based on the node type
    switch ($reader->nodeType) {
        case XMLReader::ELEMENT:
            // read an element, so write it back to the output
            $writer->startElement($reader->name);
            if ($reader->hasAttributes) {
                // loop through all attributes and write them
                while($reader->moveToNextAttribute()) {
                    $writer->writeAttribute($reader->name, $reader->value);
                }
                // move back to the beginning of the element
                $reader->moveToElement();
            }
            // if the tag is empty, close it now
            if ($reader->isEmptyElement) {
                $writer->endElement();
            }
            break;
        case XMLReader::END_ELEMENT:
            $writer->endElement();
            break;
        case XMLReader::TEXT:
            $writer->text($reader->value);
            break;
        case XMLReader::COMMENT:
            // you can change this to be more flexible if you need,
            // e.g. preg_match, trim, etc.
            if (trim($reader->value) === 'more') {

                // write whatever you want in here. If you have xml text
                // you want to write verbatim, use writeRaw() instead of text()
                $writer->text('FOOBAR');

                // this is where the magic happens -- endDocument closes
                // any remaining open tags
                $writer->endDocument();
                // stop the loop (could use "break 2", but that gets confusing)
                $done = true;
            }
            break;
    }
}
echo $writer->outputMemory();

Answer 4

As you have stated the problem, it is as simple as this:

str_replace('<!--more-->', 'FOOBAR', $original_text);

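Perhaps if you update your question to explain what the array has to do with the problem, it would help clarify what you are actually asking (is the string supposed to be in an array?).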
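For comparison, here is a small sketch of the difference between replacing the marker and splitting on it. The variable names are illustrative, and the guess that the array in the question comes from something like explode() is an assumption.

```php
<?php
$original_text = '<p>The quick <a href="/">brown</a> fox jumps <!--more--> over the <a href="/">lazy</a> dog.</p>';

// Replacement only: everything after the marker is still present.
$replaced = str_replace('<!--more-->', 'FOOBAR', $original_text);

// Splitting on the marker gives a two-element array like the one in the question.
$parts = explode('<!--more-->', $original_text, 2);

// Keeping only the first part truncates the text, but leaves <p> unclosed.
$truncated = $parts[0] . 'FOOBAR';

echo $replaced, "\n";
echo $truncated, "\n";
```

The first line printed still ends with "dog.</p>", while the second stops at FOOBAR without a closing </p>, which is the unbalanced state the question describes.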

Answer 5

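You will have to find all the tags that have been opened but not closed before the placeholder text. Insert the new text as you do now, and then close those tags.

Here is a sloppy example. I think this code will work with all valid HTML, but I'm not positive, and it will definitely accept invalid markup. But anyway: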

$h = '<p>The quick <a href="/">brown</a> fox jumps <!--more-->
over the <a href="/">lazy</a> dog.</p>';

$parts = explode("<!--more-->", $h, 2);
$front = $parts[0];

/* Find all opening tags in the front string; PREG_SET_ORDER gives one entry per tag */
$tags = array();
preg_match_all("|<([a-z][\w]*)(?: +\w*=\"[\\w/%&=]+\")*>|i", $front, $tags, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

/* Check whether each opened tag is closed later in the front string */
$unclosed = array();
foreach($tags as $t) {
    list($tag, $pos) = $t[1]; /* capture group 1: the tag name and its offset */
    if(strpos($front, "</".$tag, $pos) === false) {
        $unclosed[] = $tag;
    }
}

/* Print the start, the replacement, and then close any open tags, innermost first. */
echo $front;
echo "FOOBAR";
foreach(array_reverse($unclosed) as $tag) {
    echo "</".$tag.">";
}

Output:

<p>The quick <a href="/">brown</a> fox jumps FOOBAR</p>
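Searching with strpos() cannot tell nested tags of the same name apart (for example, a div inside a div). A stack-based variant handles that case; the sketch below is one way to do it. closeOpenTags() is a hypothetical helper, the list of void elements is a common subset rather than a complete one, and the regex assumes attribute values never contain ">".

```php
<?php
// Hypothetical helper: append closing tags for every element still open
// at the end of an HTML fragment, innermost first.
function closeOpenTags(string $fragment): string
{
    // Elements that never take a closing tag (assumed subset).
    $void  = array('area', 'base', 'br', 'col', 'embed', 'hr', 'img',
                   'input', 'link', 'meta', 'source', 'track', 'wbr');
    $stack = array();

    // Group 1 is "/" for a closing tag, group 2 is the tag name.
    preg_match_all('~<(/?)([a-z][a-z0-9]*)\b[^>]*>~i', $fragment, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        $name = strtolower($m[2]);
        if (in_array($name, $void, true) || substr($m[0], -2) === '/>') {
            continue; // void or self-closing: nothing to balance
        }
        if ($m[1] === '/') {
            // Find the most recent opener with this name and pop back to it.
            $idx = array_search($name, array_reverse($stack, true), true);
            if ($idx !== false) {
                $stack = array_slice($stack, 0, $idx);
            }
        } else {
            $stack[] = $name;
        }
    }

    $closing = '';
    foreach (array_reverse($stack) as $name) {
        $closing .= '</' . $name . '>';
    }
    return $fragment . $closing;
}

$front = '<p>The quick <a href="/">brown</a> fox jumps ';
echo closeOpenTags($front . 'FOOBAR');
// prints: <p>The quick <a href="/">brown</a> fox jumps FOOBAR</p>
```

Like the regex above, this only counts tags and does not validate the markup, so a real parser is still the safer choice for arbitrary input.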


Answer 1: 4 votes, posted 2011-10-01 08:12:40

Answer 2: 2 votes, posted 2009-11-12 20:54:39

Answer 3: 1 vote, accepted, posted 2009-11-12 21:50:12

Answer 4: 0 votes, posted 2009-11-12 20:28:54

Answer 5: 0 votes, posted 2009-11-12 20:58:15
