Limited content breaks the HTML layout in PHP

Question

I ran into a problem when trying to limit the length of the description content. This is what I tried:

<?php 
$intDescLt = 400;
$content   = $arrContentList[$arr->nid]['description'];
$excerpt   = substr($content, 0, $intDescLt);
?>
<div class="three16 DetailsDiv">
    <?php echo $excerpt; ?>
</div>

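If the description field contains only plain text without HTML tags, this works fine. But if the content contains HTML tags and the limit cuts the string off before a closing tag, that tag's styling is applied to everything that follows.

So I need to know how to fix this.

Example of the problem: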

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo substr($string, 0, 15);

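HTML output in the console: <p><b>Lorem Ips. The <b> tag is now applied to the rest of the content on the page.

Expected output in the console: <p><b>Lorem Ips</b></p>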

Answer 1

OK, given the example you provided:

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
$substring = substr($string, 0, 15);

If you want to close all unclosed tags, one possible solution is to use the DOMDocument class:

$doc = new DOMDocument();
$doc->loadHTML($substring);
$yourText = $doc->saveHTML($doc->getElementsByTagName('*')->item(2));
//item(0) = html
//item(1) = body
echo htmlspecialchars($yourText);
//<p><b>Lorem Ips</b></p>
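
As a rough sketch, the same idea could be wrapped in a helper that serializes every child of <body> instead of relying on the hard-coded item(2). The function name closeOpenTags is hypothetical, and the libxml_use_internal_errors() calls are an addition that keeps parser warnings about the truncated markup quiet. It assumes the cut does not land in the middle of a tag.

```php
<?php
// Hypothetical helper, not part of the original answer: lets DOMDocument
// re-close whatever tags a substr() cut left open.
function closeOpenTags(string $html): string
{
    if ($html === '') {
        return ''; // loadHTML() rejects an empty string
    }

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize each child of <body>, so the result does not depend on
    // a fixed node index.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo closeOpenTags(substr($string, 0, 15)), "\n";
// <p><b>Lorem Ips</b></p>
```

Note that this only repairs the markup. The 15-character limit still counts the tags, so the visible text is shorter than 15 characters.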

Answer 2

You can't just use PHP's binary string functions on an HTML string and expect everything to work.

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";

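First, you need to decide what kind of excerpt you want to create in an HTML context. Let's take an example based on the actual text length in characters, that is, not counting the size of the HTML tags. Tags should also stay properly closed.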
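
To make the difference between the two lengths concrete, here is a small sketch. Using strip_tags() as the measuring tool is an assumption for illustration only; it does not decode entities and is not part of the approach described in this answer.

```php
<?php
$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";

$rawLength  = strlen($string);             // counts markup and text
$textLength = strlen(strip_tags($string)); // counts text only

// The four tags <p>, <b>, </b> and </p> account for the difference.
echo $rawLength - $textLength, "\n"; // 14

// A 15-byte substr() therefore yields only 9 characters of visible text.
echo strip_tags(substr($string, 0, 15)), "\n"; // Lorem Ips
```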

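Start by creating a DOMDocument so that you can operate on the HTML fragment you have. The loaded $string becomes the child nodes of the <body> tag, so the code fetches that element for reference as well: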

$doc    = new DOMDocument();
$result = $doc->loadHTML($string);
if (!$result) {
    throw new InvalidArgumentException('String could not be parsed as HTML fragment');
}
$body = $doc->getElementsByTagName('body')->item(0);

Next, you need to operate on all nodes inside it in document order. Iterating over these nodes is easy with an XPath query:

$xp    = new DOMXPath($doc);
$nodes = $xp->query('./descendant::node()', $body);
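
To see what "document order" means for the example string, a short sketch like the following could list the node names the query returns. The $names array is only there for illustration.

```php
<?php
$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";

$doc = new DOMDocument();
$doc->loadHTML($string);
$body = $doc->getElementsByTagName('body')->item(0);

$xp    = new DOMXPath($doc);
$names = [];
foreach ($xp->query('./descendant::node()', $body) as $node) {
    // Elements report their tag name, text nodes report "#text".
    $names[] = $node->nodeName;
}

echo implode(', ', $names), "\n";
// p, b, #text, #text
```

The first #text is "Lorem Ipsum" inside <b>, the second is the remaining text inside <p>.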

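Then the logic for creating the excerpt needs to be implemented. That is, all text nodes are kept until their length exceeds the number of characters left. If it does, the node is split; if no characters are left, the node is removed from its parent: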

$length = 0;
foreach ($nodes as $node) {
    if (!$node instanceof DOMText) {
        continue;
    }
    $left = max(0, 15 - $length);
    if ($left) {
        if ($node->length > $left) {
            $node->splitText($left);
            $node->nextSibling->parentNode->removeChild($node->nextSibling);
        }
        $length += $node->length;
    } else {
        $node->parentNode->removeChild($node);
    }
}

Finally, you need to turn the inner HTML of the body tag into a string to obtain the result:

$buffer = '';
foreach ($body->childNodes as $node) {
    $buffer .= $doc->saveHTML($node);
}

echo $buffer;

This gives you the following result:

<p><b>Lorem Ipsum</b> is </p>

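Because the element nodes have not been changed, only the text nodes, the elements are still intact. Just the text has been shortened. The Document Object Model lets you do the traversal, the string operations, and the node removal as needed.

As you can imagine, a more simplistic string function like substr() is not capable of handling HTML in a similar way.

In practice there may be more to do: the HTML in the string might be invalid (check the Tidy extension), you might want to remove HTML attributes and tags (images, scripts, iframes), and you might also want to account for the tags themselves. The DOM will allow you to do so.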
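
As one possible sketch of the tag removal mentioned above, unwanted elements could be selected with XPath and detached before serializing. The input string and the list of tags are hypothetical examples, not something prescribed by this answer.

```php
<?php
// Hypothetical input; the tag list below is only an example.
$html = '<p>Hi<script>alert(1)</script><img src="x.png"> there</p>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$body = $doc->getElementsByTagName('body')->item(0);

// Select the unwanted elements anywhere below <body> and detach them.
$xp = new DOMXPath($doc);
foreach ($xp->query('.//script | .//iframe | .//img', $body) as $unwanted) {
    $unwanted->parentNode->removeChild($unwanted);
}

// Serialize what is left of the body's content.
$clean = '';
foreach ($body->childNodes as $node) {
    $clean .= $doc->saveHTML($node);
}

echo $clean, "\n";
```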

Full example:

<?php
/**
 * Limited content break the HTML layout in php
 *
 * @link http://stackoverflow.com/a/29323396/367456
 * @author hakre <http://hakre.wordpress.com>
 */

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo substr($string, 0, 15), "\n";

$doc    = new DOMDocument();
$result = $doc->loadHTML($string);
if (!$result) {
    throw new InvalidArgumentException('String could not be parsed as HTML fragment');
}
$body = $doc->getElementsByTagName('body')->item(0);

$xp    = new DOMXPath($doc);
$nodes = $xp->query('./descendant::node()', $body);

$length = 0;
foreach ($nodes as $node) {
    if (!$node instanceof DOMText) {
        continue;
    }
    $left = max(0, 15 - $length);
    if ($left) {
        if ($node->length > $left) {
            $node->splitText($left);
            $node->nextSibling->parentNode->removeChild($node->nextSibling);
        }
        $length += $node->length;
    } else {
        $node->parentNode->removeChild($node);
    }
}

$buffer = '';
foreach ($body->childNodes as $node) {
    $buffer .= $doc->saveHTML($node);
}

echo $buffer;
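
For reuse, the steps of the full example could be collected into a function that takes the limit as a parameter. This is a sketch only: the name htmlExcerpt, the empty-input guards, and the libxml_use_internal_errors() calls are additions, and it assumes plain ASCII input like the example string (other encodings would need to be declared to the parser).

```php
<?php
// Hypothetical wrapper around the steps of the full example.
function htmlExcerpt(string $html, int $limit): string
{
    if ($html === '') {
        return ''; // loadHTML() rejects an empty string
    }

    $previous = libxml_use_internal_errors(true);
    $doc      = new DOMDocument();
    $loaded   = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        throw new InvalidArgumentException('String could not be parsed as HTML fragment');
    }

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $xp     = new DOMXPath($doc);
    $length = 0;
    // Visit only the text nodes, in document order.
    foreach ($xp->query('./descendant::text()', $body) as $node) {
        $left = max(0, $limit - $length);
        if ($left === 0) {
            // Budget used up: drop this text node entirely.
            $node->parentNode->removeChild($node);
            continue;
        }
        if ($node->length > $left) {
            // Keep the first $left characters and discard the remainder.
            $node->splitText($left);
            $node->parentNode->removeChild($node->nextSibling);
        }
        $length += $node->length;
    }

    $buffer = '';
    foreach ($body->childNodes as $child) {
        $buffer .= $doc->saveHTML($child);
    }
    return $buffer;
}

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo htmlExcerpt($string, 15), "\n";
// <p><b>Lorem Ipsum</b> is </p>
```

In the question's template this would replace the substr() call, for example $excerpt = htmlExcerpt($content, $intDescLt);.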
