How to cut an HTML text in PHP without breaking the tags hierarchy

I'm trying to trim some HTML text and found a related thread, but I can't comment on it yet because I'm new ("Using PHP substr() and strip_tags() while retaining formatting and without breaking HTML").

First I created the function preview() (inputs: HTML or plain text, the number of characters, and a boolean that is true if you want plain-text output), but when I tried to extend it to work with HTML tags, the problems began.

I used the function html_cut() from the other post to close tags, but I need some nested tags, and I think the function closes every tag it finds, which breaks the hierarchy. (Is that in fact the problem, or am I wrong?)

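function preview($text, $char, $sinhtml){
    // Only shorten text that is longer than the requested length.
    if(strlen($text) > $char){
        $len  = strlen($text);
        $post = substr($text, $char, 1);
        if ($post != " "){
            // Look backwards for a space; if none is found, look forwards.
            // The length check stops the loop when the text has no space at all.
            $i = true;
            while($post != " " && $char < $len){
                if($char > 0 && $i){
                    $char--;
                    $post = substr($text, $char, 1);
                }elseif($char == 0){
                    $i = false;
                    $char++;
                }else{
                    $char++;
                    $post = substr($text, $char, 1);
                }
            }
        }
        // Cut at the space that was found and mark the truncation.
        $post = substr($text, 0, $char);
        $post .= " …";
        if($sinhtml){
            // Plain-text output requested: drop all tags.
            return strip_tags($post);
        }else{
            return $post;
        }
    }else{
        return $text;
    }
}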

The input text is something like this:

<p> Some text… </p>
<ul>
   <li>Technical Description</li>
   <li>or Details (weight, size, etc.)</li>
   <li>…</li>
</ul>
<p>may be some more text</p>

The function html_cut() has a line that I've never seen before, and I don't know what it does: $symbol = $text{$i}

function html_cut($text, $max_length)
{
    $tags   = array();
    $result = "";

    $is_open   = false;
    $grab_open = false;
    $is_close  = false;
    $in_double_quotes = false;
    $in_single_quotes = false;
    $tag = "";

    $i = 0;
    $stripped = 0;

    $stripped_text = strip_tags($text);

    while ($i < strlen($text) && $stripped < strlen($stripped_text) && $stripped < $max_length)
    {
        $symbol  = $text{$i};
        $result .= $symbol;

        switch ($symbol)
        {
            case '<':
                $is_open   = true;
                $grab_open = true;
                break;

            case '"':
                if ($in_double_quotes)
                    $in_double_quotes = false;
                else
                    $in_double_quotes = true;

                break;

            case "'":
                if ($in_single_quotes)
                    $in_single_quotes = false;
                else
                    $in_single_quotes = true;

                break;

            case '/':
                if ($is_open && !$in_double_quotes && !$in_single_quotes)
                {
                    $is_close  = true;
                    $is_open   = false;
                    $grab_open = false;
                }

                break;

            case ' ':
                if ($is_open)
                    $grab_open = false;
                else
                    $stripped++;

                break;

            case '>':
                if ($is_open)
                {
                    $is_open   = false;
                    $grab_open = false;
                    array_push($tags, $tag);
                    $tag = "";
                }
                else if ($is_close)
                {
                    $is_close = false;
                    array_pop($tags);
                    $tag = "";
                }

                break;

            default:
                if ($grab_open || $is_close)
                    $tag .= $symbol;

                if (!$is_open && !$is_close)
                    $stripped++;
        }

        $i++;
    }

    while ($tags)
        $result .= "</".array_pop($tags).">";

    return $result;
}
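
On the $text{$i} line: this is an older alternative spelling of $text[$i], which reads the single byte at offset $i of the string. The curly-brace form was deprecated in PHP 7.4 and removed in PHP 8.0, so on current PHP versions the line has to be written with square brackets. A small sketch of the same character-by-character access, using a made-up sample string:

```php
<?php
$text = '<p>Hi</p>';   // sample input, chosen only for illustration

$symbols = [];
for ($i = 0; $i < strlen($text); $i++) {
    // Same meaning as the older $text{$i}: the byte at offset $i.
    $symbols[] = $text[$i];
}

echo implode(' ', $symbols), PHP_EOL;
```

Note that this access works on bytes, so a multibyte UTF-8 character such as "…" is returned in several pieces rather than as one symbol.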

Try using an HTML parser or HTML Tidy to check the nested tags.
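
What a parser provides here is knowledge of which tags are still open at the cut point, so they can be closed innermost first. As a rough illustration of that bookkeeping only, here is a minimal tag-stack sketch. The function name truncate_html(), its parameters, and the default suffix are hypothetical and not taken from either post. It assumes reasonably well-formed input in which every literal "<" in the text is written as an entity, and it cuts at an exact character count rather than at a word boundary. A real parser or Tidy will cope with malformed markup far better.

```php
<?php
// Hypothetical helper (not from the original posts): keeps at most $limit
// visible characters of $html and closes the tags that are still open.
function truncate_html(string $html, int $limit, string $suffix = ' …'): string
{
    // Elements that never take a closing tag, so they must not be stacked.
    $void = ['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input',
             'link', 'meta', 'source', 'track', 'wbr'];

    // Split into alternating text runs and tags, keeping the tags as tokens.
    $tokens = preg_split('/(<[^>]*>)/', $html, -1,
                         PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    $open  = [];   // stack of currently open tag names
    $out   = '';
    $count = 0;    // visible characters emitted so far

    foreach ($tokens as $token) {
        if ($token[0] === '<') {
            if (preg_match('#^<\s*/\s*([a-z][a-z0-9]*)#i', $token, $m)) {
                // Closing tag: pop it if it matches the innermost open tag.
                if ($open && end($open) === strtolower($m[1])) {
                    array_pop($open);
                }
            } elseif (preg_match('#^<\s*([a-z][a-z0-9]*)#i', $token, $m)) {
                // Opening tag: remember it unless it is void or self-closed.
                $name = strtolower($m[1]);
                if (!in_array($name, $void, true) && substr($token, -2) !== '/>') {
                    $open[] = $name;
                }
            }
            $out .= $token;   // tags never count towards the limit
            continue;
        }

        // Text run: treat each entity or UTF-8 character as one unit.
        preg_match_all('/&#?[a-zA-Z0-9]+;|./us', $token, $m);
        $chars     = $m[0];
        $remaining = $limit - $count;

        if (count($chars) <= $remaining) {
            $out   .= $token;
            $count += count($chars);
            continue;
        }

        // The limit falls inside this run: keep the allowed part and stop.
        $out .= implode('', array_slice($chars, 0, $remaining)) . $suffix;
        break;
    }

    // Close whatever is still open, innermost first.
    while ($open) {
        $out .= '</' . array_pop($open) . '>';
    }

    return $out;
}

echo truncate_html('<ul><li>Technical Description</li><li>or Details</li></ul>', 12), PHP_EOL;
// prints: <ul><li>Technical De …</li></ul>
```

Whitespace between tags counts as visible text in this sketch, so indented markup like the sample input reaches the limit slightly earlier than its visible words suggest.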
