Limiting content breaks the HTML layout in PHP

Question

I am facing an issue when I try to limit the length of the description content. I have tried it like this:

<?php 
$intDescLt = 400;
$content   = $arrContentList[$arr->nid]['description'];
$excerpt   = substr($content, 0, $intDescLt);
?>
<div class="three16 DetailsDiv">
    <?php echo $excerpt; ?>
</div>

If the description field contains plain text without HTML tags, this works fine. But if the content contains HTML tags and the limit is reached before a closing tag, that tag's style is applied to all the content that follows it.

How can I resolve this issue?

Example of the issue:

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo substr($string, 0, 15);

HTML output in the console: <p><b>Lorem Ips

The unclosed <b> tag is now applied to the rest of the content on the page.

Expected output in the console: <p><b>Lorem Ips</b></p>

Answer 1

Ok, given the example you provided:

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
$substring = substr($string, 0, 15);

One possible solution is to use the DOMDocument class if you want to close all unclosed tags:

$doc = new DOMDocument();
$doc->loadHTML($substring);
$yourText = $doc->saveHTML($doc->getElementsByTagName('*')->item(2));
//item(0) = html
//item(1) = body
echo htmlspecialchars($yourText);
//<p><b>Lorem Ips</b></p>
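The snippet above relies on item(2) being the first element inside <body>, which only holds for this particular input. A minimal sketch of a more general variant follows; closeOpenTags() is a hypothetical helper name, not part of PHP, and it assumes the cut did not land in the middle of a tag name or attribute. It serializes every child of <body> instead of a fixed index:

```php
<?php
// Hypothetical helper: lets DOMDocument close whatever tags substr() cut off.
// Assumption: the fragment was not cut in the middle of a tag or attribute.
function closeOpenTags(string $fragment): string
{
    if (trim($fragment) === '') {
        return ''; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Collect parser warnings quietly instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($fragment);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    // Serialize the children of <body>, i.e. the repaired fragment.
    $html = '';
    foreach ($body->childNodes as $child) {
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo closeOpenTags(substr($string, 0, 15)), "\n";
```

Note that the limit still counts the bytes of the tags themselves, so the visible text ends up shorter than 15 characters.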

Answer 2

You can't just use PHP's binary string functions on an HTML string and then expect things to work.

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";

First of all, you need to decide what kind of excerpt you'd like to create in the HTML context. Let's take an example that is concerned with the actual text length in characters, that is, not counting the size of the HTML tags. All tags should also remain properly closed.
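To make the difference between markup length and text length concrete, here is a small sketch using strlen() and strip_tags(). The variable names $rawLength and $textLength are only illustrative:

```php
<?php
$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";

// Length of the raw markup, tags included.
$rawLength = strlen($string);

// Length of the visible text only, with tags stripped.
$textLength = strlen(strip_tags($string));

// The four tags <p>, <b>, </b> and </p> account for the difference (14 bytes).
printf("raw: %d, text: %d, tags: %d\n", $rawLength, $textLength, $rawLength - $textLength);
```

A limit applied with substr() is measured against the raw length, while the excerpt described here is measured against the text length.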

You start by creating a DOMDocument so that you can operate on the HTML fragment you have. The nodes parsed from $string become the child nodes of the <body> element, so the code keeps a reference to it as well:

$doc    = new DOMDocument();
$result = $doc->loadHTML($string);
if (!$result) {
    throw new InvalidArgumentException('String could not be parsed as HTML fragment');
}
$body = $doc->getElementsByTagName('body')->item(0);

Next, you need to operate on all the nodes within it in document order. Iterating over these nodes is easy with the help of an XPath query:

$xp    = new DOMXPath($doc);
$nodes = $xp->query('./descendant::node()', $body);
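To see what "document order" means for this query, the following sketch lists the nodes it returns for a shortened version of the example string. The $seen array is only there for illustration:

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML("<p><b>Lorem Ipsum</b> is simply dummy text.</p>");
$body = $doc->getElementsByTagName('body')->item(0);

$xp   = new DOMXPath($doc);
$seen = [];
foreach ($xp->query('./descendant::node()', $body) as $node) {
    // Text nodes carry the characters; element nodes only carry structure.
    $seen[] = $node instanceof DOMText
        ? '#text "' . $node->data . '"'
        : $node->nodeName;
}

print_r($seen);
```

Elements and text nodes come back interleaved, parents before their children, which is why the excerpt logic can simply skip everything that is not a DOMText.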

Then the logic that creates the excerpt needs to be implemented. Text nodes are kept as they are until one of them is longer than the number of characters left. That node is split and its tail removed; once no characters are left, every remaining text node is removed from its parent:

$length = 0;
foreach ($nodes as $node) {
    if (!$node instanceof DOMText) {
        continue;
    }
    $left = max(0, 15 - $length);
    if ($left) {
        if ($node->length > $left) {
            $node->splitText($left);
            $node->nextSibling->parentNode->removeChild($node->nextSibling);
        }
        $length += $node->length;
    } else {
        $node->parentNode->removeChild($node);
    }
}

At the end, you need to turn the inner HTML of the body tag into a string to obtain the result:

$buffer = '';
foreach ($body->childNodes as $node) {
    $buffer .= $doc->saveHTML($node);
}

echo $buffer;

This will give you the following result:

<p><b>Lorem Ipsum</b> is </p>

Because only text nodes have been altered, not element nodes, the elements are still intact. Just the text has been shortened. The Document Object Model allows you to do the traversal, the string operations, and the node removal as needed.

As you can imagine, a simpler string function like substr() is not capable of handling HTML in the same way.

In reality there might be more to do: the HTML in the string might be invalid (check the Tidy extension), you might want to drop HTML attributes and tags (images, scripts, iframes), and you might also want to take the size of the tags into account. The DOM will allow you to do so.
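As one example of such extra work, here is a rough sketch of dropping unwanted elements before building the excerpt. removeElements() is a hypothetical helper, and the list of tag names and the sample markup are just assumptions for the demonstration:

```php
<?php
// Hypothetical helper: removes every element with one of the given tag names.
function removeElements(DOMDocument $doc, array $tagNames): void
{
    foreach ($tagNames as $tagName) {
        // getElementsByTagName() returns a live list, so take a snapshot
        // before removing anything from the document.
        $elements = iterator_to_array($doc->getElementsByTagName($tagName));
        foreach ($elements as $element) {
            $element->parentNode->removeChild($element);
        }
    }
}

$doc = new DOMDocument();
$doc->loadHTML('<p>Hello <img src="a.png" alt=""> world</p><script>alert(1)</script>');
removeElements($doc, ['script', 'iframe', 'img']);

// Serialize what is left inside <body>.
$body  = $doc->getElementsByTagName('body')->item(0);
$clean = '';
foreach ($body->childNodes as $child) {
    $clean .= $doc->saveHTML($child);
}

echo $clean, "\n";
```

Running this step before the text nodes are counted keeps script source and similar content out of the excerpt length.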

The example in full (online demo):

<?php
/**
 * Limited content break the HTML layout in php
 *
 * @link http://stackoverflow.com/a/29323396/367456
 * @author hakre <http://hakre.wordpress.com>
 */

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo substr($string, 0, 15), "\n";

$doc    = new DOMDocument();
$result = $doc->loadHTML($string);
if (!$result) {
    throw new InvalidArgumentException('String could not be parsed as HTML fragment');
}
$body = $doc->getElementsByTagName('body')->item(0);

$xp    = new DOMXPath($doc);
$nodes = $xp->query('./descendant::node()', $body);

$length = 0;
foreach ($nodes as $node) {
    if (!$node instanceof DOMText) {
        continue;
    }
    $left = max(0, 15 - $length);
    if ($left) {
        if ($node->length > $left) {
            $node->splitText($left);
            $node->nextSibling->parentNode->removeChild($node->nextSibling);
        }
        $length += $node->length;
    } else {
        $node->parentNode->removeChild($node);
    }
}

$buffer = '';
foreach ($body->childNodes as $node) {
    $buffer .= $doc->saveHTML($node);
}

echo $buffer;
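For reuse, for instance with the 400 character limit from the question, the same steps could be wrapped in a function. This is only a sketch: truncateHtml() and its parameters are hypothetical names, it selects text nodes directly with descendant::text() rather than filtering all nodes, and it assumes input that DOMDocument can parse:

```php
<?php
// Hypothetical wrapper around the steps above; $limit counts text characters only.
function truncateHtml(string $html, int $limit): string
{
    if (trim($html) === '') {
        return ''; // loadHTML() rejects empty input
    }

    $doc      = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    $loaded   = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        throw new InvalidArgumentException('String could not be parsed as HTML fragment');
    }

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $xp        = new DOMXPath($doc);
    $remaining = max(0, $limit);
    foreach ($xp->query('./descendant::text()', $body) as $text) {
        if ($remaining === 0) {
            // Budget used up: drop the text node entirely.
            $text->parentNode->removeChild($text);
            continue;
        }
        if ($text->length > $remaining) {
            // Keep the first $remaining characters and discard the tail.
            $tail = $text->splitText($remaining);
            $tail->parentNode->removeChild($tail);
        }
        $remaining -= $text->length;
    }

    $buffer = '';
    foreach ($body->childNodes as $child) {
        $buffer .= $doc->saveHTML($child);
    }
    return $buffer;
}

$string = "<p><b>Lorem Ipsum</b> is simply dummy text of the printing and typesetting industry.</p>";
echo truncateHtml($string, 15), "\n";
```

In the template from the question, the call would then look like truncateHtml($content, $intDescLt) in place of substr($content, 0, $intDescLt).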

solution1
0 2015-03-25 14:51:14

solution2
0 2015-03-28 22:33:44
