
Don't count br and nbsp in a string

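I have a string that contains "br" and "nbsp;" tags. I need to limit it to 100 characters, meaning only 100 characters should be displayed. Because each "br" takes 4 characters, I end up with 108 characters instead of 100. I can get the output below with one line: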

data.substr(0,100) 

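Output =>

it to make a type specimen book.

It has survived not only five centuries, but also the leap

But the string contains br tags. I don't want to remove the br and nbsp; tags, I just don't want them counted.

Expected output =>

it to make a type specimen book.

It has survived not only five centuries, but also the leap into

I have put together a snippet, but the count comes out as 108: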

var data = `it to make a type specimen book. <br><br>It has survived not only five centuries, but also the leap into electronic typesetting, <br><br>remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages<br><br>, and more recently with desktop publishing software like Aldus PageMaker including&nbsp; versions of Lorem Ipsum.`

// removes nbsp
var docDesc = data.replace(/[&]nbsp[;]/gi, " ");

// removes br
var stringData = docDesc.replace(/[<]br[^>]*[>]/gi, "");

var subData = stringData.substr(0, 100)

function test(subData) {
    var n = subData.split(" ");
    return n.slice(Math.max(n.length - 5, 1))
}

var lastData = test(subData);
var lastString = lastData.join(" ")
var finalData = data.substring(0, data.indexOf(lastString)) + lastString

console.log(finalData)
console.log(finalData.length)

In its simplest form, you can write a function that works like substring but does not count certain "words", like this:

function substringWithExcludes(str, excludes, length) {
    let idx = 0;
    let len = 0;

    while(idx < str.length && len < length){
        let match = false;

        for(let exclude of excludes) {
            if(str.startsWith(exclude, idx)) {
                idx += exclude.length;
                match = true;
                break;
            }
        }

        if(!match) {
            len++;
            idx++;
        }
    }

    return str.substring(0, idx);
}

Called like this:

const data = `it to make a type specimen book. <br>\r\n<br>\r\nIt has survived not only five centuries, but also the leap into electronic typesetting, <br>\r\n<br>\r\nremaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages<br>\r\n<br>\r\n, and more recently with desktop publishing software like Aldus PageMaker including&nbsp; versions of Lorem Ipsum.`;

const result = substringWithExcludes(data, ["\r", "\n", "&nbsp;", "<br>"], 100);

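len tracks the length of the string without the <br> and other excluded tokens, whereas idx includes those matches. For each exclude, we first check whether it matches at the current position and, if so, add its length to idx. If nothing matches, the current character is a valid one to include, so both len and idx are incremented.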
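To check that a truncated result really holds the intended number of counted characters, one option is to strip the excluded tokens and measure what is left. The sketch below is only an illustration: countedLength and sample are hypothetical names that do not appear in the answer, and removing tokens one after another could, in unusual inputs, join fragments into a new token.

```javascript
// Hypothetical helper (not part of the original answer): returns the number
// of characters that remain once every excluded token has been removed.
function countedLength(str, excludes) {
    let remaining = str;
    for (const exclude of excludes) {
        // split/join removes every occurrence of the token.
        remaining = remaining.split(exclude).join("");
    }
    return remaining.length;
}

const sample = "ab<br>\r\ncd&nbsp;ef";
console.log(sample.length);                                         // 18
console.log(countedLength(sample, ["\r", "\n", "&nbsp;", "<br>"])); // 6
```

Applied to the result of substringWithExcludes with the same excludes list, this count should equal the requested length whenever the input is long enough.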

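This may be slow for a large length combined with many excludes, but it gets the job done. You could add handling for case insensitivity and match <br /> if necessary. If needed, startsWith can be swapped for a regular-expression match.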
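As a rough sketch of that last suggestion, the startsWith loop could be replaced by one sticky regular expression, which also covers case-insensitive tags and the <br /> form. The function name, the parameter order, and the default pattern below are assumptions made for illustration, not part of the answer above.

```javascript
// Hypothetical regex-based variant; the name and the default pattern are assumptions.
function substringWithExcludePattern(str, length, pattern = /<br\s*\/?>|&nbsp;|\r|\n/i) {
    // Rebuild the pattern with the sticky flag (and without "g") so that
    // exec() only matches a token starting exactly at lastIndex.
    const flags = pattern.flags.replace("g", "").replace("y", "") + "y";
    const sticky = new RegExp(pattern.source, flags);

    let idx = 0; // position in the original string
    let len = 0; // characters counted so far

    while (idx < str.length && len < length) {
        sticky.lastIndex = idx;
        const match = sticky.exec(str);
        if (match && match[0].length > 0) {
            idx += match[0].length; // skip the token without counting it
        } else {
            len++; // an ordinary character: count it and move on
            idx++;
        }
    }

    return str.substring(0, idx);
}

console.log(substringWithExcludePattern("ab<BR />cd", 3)); // "ab<BR />c"
```

Like the startsWith version, this stops as soon as the requested number of characters has been counted, so excluded tokens that directly follow the last counted character are not included.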

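@DILEEP, please have a look at the code below.

If you have any trouble understanding the code, please leave a comment and I will do my best to answer.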

/*
    Function takes a string (data)
    Returns string with first 100 characters from index 0 to 100 (default)
    Returns string based on explicitly passed values of start and end
*/
function get100charsNoCountBrsAndSpaces(data, start=0, end=100) {
    var arr = stringToArrWithNoBrsAndSpaces(data)
    let arrSpaces = arr.map((item) => {
        return item.join(' ')
    })

    let strBrsSpaces = arrSpaces.join(' '); // "sdd fhhf fhhhf fhhf"
    var finalStr;
    var spacesCount = 0;

    // Widen the slice by the number of whitespace characters found so far,
    // until it holds 100 non-space characters
    do {
        finalStr = strBrsSpaces.slice(start, end + spacesCount)
        spacesCount = finalStr.match(/\s/gi).length
    }
    while(finalStr.slice(start, end + spacesCount).split(' ').join('').length < 100);

    return finalStr.slice(start, end + spacesCount)
}

/*
    Function that removes <br> and spaces from string (data)
    and returns a 2d array (it helps us in reconstruction of original string)
*/
function stringToArrWithNoBrsAndSpaces(data) {
    var arrNoBrs = data.split('<br>')
    // console.log(JSON.stringify(arrNoBrs, null, 4))

    let arrNoBrsSpaces = arrNoBrs.map((item) => {
        let a = []; // let: local scope of a
        a = item.split(' ')
        return a;
    })
    // console.log(JSON.stringify(arrNoBrsSpaces, null, 4))

    return arrNoBrsSpaces
}

/*
    Function which reconstructs the string from the 2d array
    Adds spaces and <br> at proper places
*/
function arrWithNoBrsAndSpacesToString(array) {
    let arrSpaces = array.map((item) => {
        return item.join(' ')
    })
    console.log(arrSpaces)

    let strBrsSpaces = arrSpaces.join('<br>')
    return strBrsSpaces
}

// ********* Testing: stringToArrWithNoBrsAndSpaces()
var inputStr = `it to make a type specimen book. <br><br>It has survived not only five centuries, but also the leap into electronic typesetting, <br><br>remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages<br><br>, and more recently with desktop publishing software like Aldus PageMaker including&nbsp; versions of Lorem Ipsum.`
var arr = stringToArrWithNoBrsAndSpaces(inputStr)
console.log(arr)
console.log('\n')

// ********* Testing: arrWithNoBrsAndSpacesToString()
var str = arrWithNoBrsAndSpacesToString(arr)
console.log(str)

// ********* Testing: get100charsNoCountBrsAndSpaces(inputStr)
var finalData = get100charsNoCountBrsAndSpaces(inputStr)
console.log('finalData:', finalData)
console.log('Length:', finalData.length) // 122 (100 char + 22 spaces), see below line
console.log('Number of spaces:', finalData.match(/\s/ig).length)
console.log('Number of chars :', finalData.split(' ').join('').length) // 100

/*
    ...** Output: stringToArrWithNoBrsAndSpaces(inputStr) **...

    [
        [ "it", "to", "make", "a", "type", "specimen", "book.", "" ],
        [ "" ],
        [ "It", "has", "survived", "not", "only", "five", "centuries,", "but", "also", "the", "leap", "into", "electronic", "typesetting,", "" ],
        [ "" ],
        [ "remaining", "essentially", "unchanged.", "It", "was", "popularised", "in", "the", "1960s", "with", "the", "release", "of", "Letraset", "sheets", "containing", "Lorem", "Ipsum", "passages" ],
        [ "" ],
        [ ",", "and", "more", "recently", "with", "desktop", "publishing", "software", "like", "Aldus", "PageMaker", "including&nbsp;", "versions", "of", "Lorem", "Ipsum." ]
    ]
*/

/*
    ...** Output: arrWithNoBrsAndSpacesToString(arr) **...

    it to make a type specimen book. <br><br>It has survived not only five centuries, but also the leap into electronic typesetting, <br><br>remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages<br><br>, and more recently with desktop publishing software like Aldus PageMaker including&nbsp; versions of Lorem Ipsum.
*/

/*
    ...** Output: get100charsNoCountBrsAndSpaces(inputStr) **...

    it to make a type specimen book. <br><br>It has survived not only five centuries, but also the leap into electronic typesetting, <br><br>remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages<br><br>, and more recently with desktop publishing software like Aldus PageMaker including&nbsp; versions of Lorem Ipsum.

    finalData: it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting,
    Length: 122
    Number of spaces: 22
    Number of chars : 100
*/

Thanks.

