
Split long string into small chunks without breaking HTML tags and words

I am breaking long text into smaller chunks using a while loop. My string contains HTML code, and I don't want the user to see stray opening or closing tag brackets from tags that were cut in half.

My template string contains the following text.

var text = "I love Stackoverflow. It helps me lot and Bla bla bla bla bla bla ";

var textString = '<div class="row page col-md-12 "><h4 style="margin-left:20px;">' +
    '<u> Working from home</u></h4><p style="margin:30px;">' + text + '<p></div>';

I am using the following method:

var i = true;
var start = 0;
var end = 20;
var increment = 0;
var incremented = 0;
var val1 = textString.slice(start, end);
while (i == true) {
    val1 = textString.slice(start, end);
    var check = val1.endsWith(' ');
    while (check == false) {
        end = end + 1;
        incremented = incremented + 1;
        val1 = textString.slice(start, end);
        if (val1.endsWith(' ')) {
            check = false;
        } else {
            check = true;
        }
        end = end + 20 + incremented;
        start = start + 20 + incremented;
        if (start > textString.length) {
            i = false;
        }
    }
}

An example is here:

    var text1 = 'I love Stackoverflow. It helps me lot and Bla bla bla bla bla bla';
    var text2 = 'Some Random Text';
    var text3 = 'Some Random Text';
    var text4 = 'Some Random Text';
    var text5 = 'Some Random Text';
    var text6 = 'Some Random Text';

    var textString =
        '<div class="row page col-md-12 "><h4 style="margin-left:20px;">' +
        '<u> text1 </u></h4><p style="margin:30px;">' + text2 + '<p></div>' +
        '<div class="row page col-md-12 "><h4 style="margin-left:20px;">' +
        '<u> text3</u></h4><p style="margin:30px;">' + text4 + '<p></div>' +
        '<div class="row page col-md-12 "><h4 style="margin-left:20px;">' +
        '<u>text5</u></h4><p style="margin:30px;">' + text6 + '<p></div>';

The output I need should look like this:

    arr[0] = '<div class="row page col-md-12 "><h4 style="margin-left:20px;">' +
        '<u> text1</u></h4><p style="margin:30px;">' + text2 + '<p></div>';

    arr[1] = '<div class="row page col-md-12 "><h4 style="margin-left:20px;">' +
        '<u> text3</u></h4><p style="margin:30px;">' + text4 + '<p></div>';

    arr[2] = '<div class="row page col-md-12 "><h4 style="margin-left:20px;">' +
        '<u> text5</u></h4><p style="margin:30px;">' + text6 + '<p></div>';

This is my current output: (image not shown)

HTML DOM nodes include their content, so you can't split them without breaking them. The following code converts your string into a DOM tree, splits off all the top-level child nodes, and recombines them based on the length of their text content, without breaking words or HTML.

If your data is bad and, for example, has a single paragraph that takes up more than one page, or a long run of letters with no spaces, then you will likely need custom solutions for each type of HTML tag and for long runs of characters.

Even with this solution, you may find that additional effort is needed to keep pre tags within your page targets.

This function takes two arguments: your string and the maximum length, in characters, that you would like for the textContent of each chunk.

var shard = function(str, len) {

    // Parse the HTML string into a detached DOM tree.
    var el = document.createElement('div');
    el.innerHTML = str;

    // Collect the top-level nodes. Text nodes are split into words and
    // single whitespace characters; element nodes are kept whole.
    var parts = [];
    var child = el.firstChild;
    while (child) {
        if (child.nodeType === 3) { // 3 is Node.TEXT_NODE
            var texts = child.nodeValue.split(/(\s)/).filter(function(s) {
                return s.length > 0;
            });
            for (var i = 0; i < texts.length; i++) {
                parts.push(document.createTextNode(texts[i]));
            }
        } else {
            parts.push(child);
        }
        child = child.nextSibling;
    }

    // Recombine the pieces, starting a new chunk whenever adding the next
    // piece would push the chunk's text length past len.
    var partsOut = [''];
    var t = 0; // text length of the current chunk

    for (var idx = 0; idx < parts.length; idx++) {
        var textLength = parts[idx].textContent.length;
        var markup = parts[idx].nodeType === 3 ?
            parts[idx].nodeValue :
            parts[idx].outerHTML;

        if (t + textLength > len) {
            partsOut.push(markup);
            t = textLength;
        } else {
            partsOut[partsOut.length - 1] += markup;
            t += textLength;
        }
    }

    return partsOut;

};

This is probably not what you want to use in a production environment, but it does attempt, where possible, to break the HTML into unbroken pieces whose text content stays within a target length.
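The function above relies on a browser DOM (`document.createElement`). Where no DOM is available, for example in Node.js, the same idea can be roughly sketched with a small regex tokenizer. This is only a sketch under stated assumptions, not a real HTML parser: the names `splitTopLevel`, `shardHtml`, and `VOID_TAGS` are hypothetical, the markup is assumed to contain no comments, scripts, or `<`/`>` characters inside text or attribute values, and entities such as `&amp;` are counted by their raw length rather than decoded.

```javascript
// Hypothetical, incomplete list of tags that never have a closing tag.
const VOID_TAGS = new Set(['br', 'hr', 'img', 'input', 'link', 'meta', 'wbr']);

// Split an HTML string into top-level pieces: whole elements, plus the
// words and whitespace runs of any text that sits outside all elements.
function splitTopLevel(html) {
  const tokens = html.match(/<[^>]*>|[^<]+/g) || [];
  const pieces = [];
  const open = []; // stack of currently open tag names
  let current = { html: '', textLength: 0 };

  for (const token of tokens) {
    const tag = /^<(\/?)([a-zA-Z][\w-]*)/.exec(token);
    if (tag) {
      const name = tag[2].toLowerCase();
      current.html += token;
      if (tag[1]) {
        // Closing tag: pop back to the matching opener. This also closes
        // unclosed children, such as the stray <p> in the question.
        const at = open.lastIndexOf(name);
        if (at !== -1) open.length = at;
      } else if (!VOID_TAGS.has(name) && !token.endsWith('/>')) {
        open.push(name);
      }
      if (open.length === 0) {
        pieces.push(current);
        current = { html: '', textLength: 0 };
      }
    } else if (open.length > 0) {
      // Text inside an element stays attached to that element.
      current.html += token;
      current.textLength += token.length;
    } else {
      // Top-level text may be broken between words.
      for (const word of token.split(/(\s+)/)) {
        if (word) pieces.push({ html: word, textLength: word.length });
      }
    }
  }
  if (current.html) pieces.push(current); // markup left unclosed at the end
  return pieces;
}

// Pack the pieces into chunks whose text length stays within
// maxTextLength where possible; a single oversized piece is kept whole.
function shardHtml(html, maxTextLength) {
  const chunks = [''];
  let used = 0;
  for (const piece of splitTopLevel(html)) {
    const last = chunks.length - 1;
    if (used + piece.textLength > maxTextLength && chunks[last] !== '') {
      chunks.push(piece.html);
      used = piece.textLength;
    } else {
      chunks[last] += piece.html;
      used += piece.textLength;
    }
  }
  return chunks;
}
```

With three consecutive `<div>` blocks that each hold 21 characters of text, `shardHtml(html, 25)` should return one chunk per block, which matches the layout asked for in the question. Unlike `shard`, this sketch does not emit an empty first chunk when the first piece is already over the limit.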

You could split the string on spaces:

let wordsArray = text.split(" ")

Then reduce it to chunks of whatever size you want:

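const chunks = []
const wordsInChunkCount = 100
let current = []
wordsArray.forEach(item => {
  current.push(item)
  // Close the chunk once it holds the requested number of words
  if (current.length === wordsInChunkCount) {
    chunks.push(current.join(' '))
    current = []
  }
})
// Keep the final, partially filled chunk
if (current.length > 0) {
  chunks.push(current.join(' '))
}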

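After that, you will have your chunks in the chunks array. Note that this works on plain text only; applied to a string that contains HTML, it can still cut a tag in half wherever an attribute list contains a space.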
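The question measures chunks in characters (20 at a time) rather than in words, so a character budget may be closer to what is wanted for plain text. The following is a minimal sketch of that variant; `chunkWords` and `maxLength` are hypothetical names, and it assumes the input is plain text with no HTML.

```javascript
// Pack whole words into chunks of at most maxLength characters.
// A single word longer than maxLength is kept whole in its own chunk.
function chunkWords(text, maxLength) {
  const chunks = [];
  let current = '';
  for (const word of text.split(/\s+/).filter(Boolean)) {
    const candidate = current ? current + ' ' + word : word;
    if (candidate.length > maxLength && current) {
      // The word does not fit: close the current chunk and start a new one.
      chunks.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

console.log(chunkWords('I love Stackoverflow. It helps me lot', 20));
// [ 'I love', 'Stackoverflow. It', 'helps me lot' ]
```

Runs of whitespace are collapsed to single spaces here, which is usually acceptable for display text but would not be for preformatted content.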
