Split HTML text without breaking "open" tags

I'm using a PHP function to split text into blocks of at most N characters. Once each block has been "treated" somehow, the blocks are concatenated back together. The problem is that the text can be HTML, and if the split occurs between open HTML tags, the "treatment" gets spoiled. Can someone give a hint about breaking the text only between closed tags?

Requirements:

  • Max block length: N
  • There are NO <body> tags
  • There are NO <HTML> tags
  • There are NO <head> tags

Adding a sample (max block length = 173):

<div class="myclass">
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer dapibus sagittis lacus quis cursus.
</div>
<div class="anotherclass">
Nulla ligula felis, adipiscing ac varius et, sollicitudin eu lorem. Sed laoreet porttitor est, sit amet vestibulum massa pretium et. In interdum auctor nulla, ac elementum ligula aliquam eget
</div>

In the text above, with 173 characters as the limit, the text would break at "adipiscing", but that would split the <div class="anotherclass"> block. In this case, the split should occur at the first closing tag, even though the resulting block is shorter than the maximum length.

The "correct" way would be to parse the HTML and perform the shortening operations on its text nodes. In PHP5 you could use the DOM extension, and specifically DOMDocument::loadHTML().
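To make the DOM suggestion a bit more concrete, here is a minimal sketch for the case in the question. It assumes it is enough to cut between top-level elements. The function name split_html_between_tags is made up for illustration. The sketch also relies on loadHTML() wrapping a bare fragment in <html><body>, counts length in bytes with strlen(), and skips encoding handling for non-ASCII input. A single element longer than the limit is left whole as an oversized chunk; splitting inside it would mean recursing into its children.

```php
<?php
// Hypothetical helper (not from the original answer): split an HTML fragment
// into chunks of at most $max bytes, cutting only between top-level nodes so
// that no chunk ends inside an open tag.
function split_html_between_tags(string $html, int $max): array
{
    $doc = new DOMDocument();

    // Collect parser warnings for tag soup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // loadHTML() wraps a bare fragment in <html><body>; the fragment's
    // top-level nodes are the children of <body>.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return [];
    }

    $chunks = [];
    $current = '';
    foreach ($body->childNodes as $node) {
        // Serialize this node together with everything nested inside it.
        $piece = $doc->saveHTML($node);

        // Close the current chunk if adding this node would exceed the limit.
        if ($current !== '' && strlen($current) + strlen($piece) > $max) {
            $chunks[] = $current;
            $current = '';
        }
        // A node longer than $max ends up as its own oversized chunk.
        $current .= $piece;
    }
    if ($current !== '') {
        $chunks[] = $current;
    }
    return $chunks;
}

$html = '<div class="a">Hello world</div><div class="b">Second block</div><p>Third</p>';
$chunks = split_html_between_tags($html, 50);
foreach ($chunks as $i => $chunk) {
    echo $i, ': ', $chunk, PHP_EOL;
}
```

Each chunk can then be "treated" on its own and the results joined again with implode(''). Note that the parser may normalize the markup slightly (attribute quoting, whitespace between blocks), so the joined result is not guaranteed to be byte-identical to the input.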

Hmm, I've used some code where I had to split copy entered through a WYSIWYG editor and wanted to retrieve the first paragraph from it. It's a little dodgy, but it did the trick for me. If you also wanted to show only the first "n" characters, you could apply substr to the "intro" var. Hope this starts you off :-|

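function break_html_description_to_chunks($description = null)
{
    $description = (string) $description; // tolerate the null default

    // Find the end of the first paragraph; strpos() returns false if there is none.
    $firstParaEnd = strpos($description, "</p>");
    if ($firstParaEnd === false) {
        // No closing </p>: treat the whole text as the intro.
        return array("intro" => $description, "body" => "");
    }
    $firstParaEnd += 4; // include the 4 characters of "</p>" itself

    $intro = substr($description, 0, $firstParaEnd);
    $body = substr($description, $firstParaEnd); // everything after the first paragraph
    return array("intro" => $intro, "body" => $body);
}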
