
JavaScript RegExp to apply span tags on strings with nested substrings

Example String:

'There is a red car parked in front of a blue house with a fence painted red.'

The strings that are to be highlighted with spans are:

['red car', 'blue house', 'red'].

Expected string:

There is a <span class='redHighlight'>red car</span> parked in front of a <span class='blueHighlight'>blue house</span> with a fence painted <span class='redHighlight'>red</span>.

However, when I run a replace for each array entry in turn, I end up with nested span tags:

There is a <span class='<span class='redHighlight'>red</span>Highlight'><span class='redHighlight'>red</span> car</span> parked in front of a <span class='blueHighlight'>blue house</span> with a fence painted <span class='redHighlight'>red</span>.

Code:

let strToHighlight = 'There is a red car parked in front of a blue house with a fence painted red.';
let stringsToMatch = [{'strVal': 'red car',
                       'cssClass': 'redHighlight'}, 
                      {'strVal': 'blue house',
                        'cssClass': 'blueHighlight'},
                      {'strVal': 'red',
                       'cssClass':'redHighlight'}
                     ];
stringsToMatch.forEach(el => {
  let regEx = new RegExp(el.strVal, 'g'); // replace all occurrences
  strToHighlight = strToHighlight.replace(regEx, `<span class='${el.cssClass}'>${el.strVal}</span>`);
  console.log(strToHighlight);
})

Any suggestions on how to avoid re-tagging strings either via RegEx or any other method?

EDIT: Each string has to be highlighted with a different style class, so I changed the stringsToMatch array into an array of objects that also hold the name of the CSS class to apply.

You need to sort the search strings by length in descending order and build a single alternation-based pattern (optionally with word boundaries, so that only whole words match). That way all replacements happen in one pass, and text inside an already inserted span is never scanned again:

let strToHighlight = 'There is a red car parked in front of a blue house with a fence painted red.';
let stringsToMatch = [
  {'strVal': 'red car', 'cssClass': 'redHighlight'},
  {'strVal': 'blue house', 'cssClass': 'blueHighlight'},
  {'strVal': 'red', 'cssClass': 'redHighlight'}
];
// Collect the search terms and put the longest first
const searchTerms = stringsToMatch.map(x => x.strVal);
searchTerms.sort((a, b) => b.length - a.length);
let regEx = new RegExp(String.raw`\b(?:${searchTerms.join('|')})\b`, 'g');
// => /\b(?:blue house|red car|red)\b/g
// Look up the CSS class for each match and wrap the match in a span
strToHighlight = strToHighlight.replace(regEx, (m) => `<span class='${stringsToMatch.find(x => x.strVal == m).cssClass}'>${m}</span>`);
console.log(strToHighlight);

Output:

There is a <span class='redHighlight'>red car</span> parked in front of a <span class='blueHighlight'>blue house</span> with a fence painted <span class='redHighlight'>red</span>.

Here,

  • searchTerms.sort((a, b) => b.length - a.length); sorts the search strings by length in descending order. This matters because the regex engine is eager: in an alternation it takes the first alternative that matches, so 'red' listed before 'red car' would always win (see "Remember That The Regex Engine Is Eager").
  • new RegExp(String.raw`\b(?:${searchTerms.join('|')})\b`,'g') creates a RegExp object with the \b(?:blue house|red car|red)\b pattern.
  • .replace(regEx, (m) => ...) replaces each match with itself wrapped in a span tag. The callback receives the matched text as m and looks up the corresponding cssClass in stringsToMatch.
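
The points above can be folded into a small reusable function. The following is only a sketch under some assumptions: highlightTerms and escapeRegExp are hypothetical names that do not appear in the answer, the term list is assumed to be non-empty, and the terms are assumed to start and end with word characters so that \b behaves as expected. Unlike the answer's code, it also escapes regex metacharacters in the terms and uses a Map for the class lookup. The last two lines show the eagerness issue in isolation.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical helper: wrap every whole-word occurrence of a term in a span.
// Assumes `terms` is a non-empty array of { strVal, cssClass } objects.
const highlightTerms = (text, terms) => {
  // Map each term to its CSS class so the replacer can look it up directly.
  const classByTerm = new Map(terms.map((t) => [t.strVal, t.cssClass]));

  // Longest alternatives first: the engine takes the first alternative that matches.
  const alternatives = [...classByTerm.keys()]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp);

  const regEx = new RegExp(String.raw`\b(?:${alternatives.join('|')})\b`, 'g');
  return text.replace(regEx, (m) => `<span class='${classByTerm.get(m)}'>${m}</span>`);
};

// Eagerness in isolation: the order of the alternatives decides what is matched.
const eager = 'a red car'.replace(/\b(?:red|red car)\b/g, '[$&]');   // 'a [red] car'
const ordered = 'a red car'.replace(/\b(?:red car|red)\b/g, '[$&]'); // 'a [red car]'
```

Calling highlightTerms(strToHighlight, stringsToMatch) with the data from the question should give the same output as shown above.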

You could replace (reduce) all matching words with positional placeholders and then replace each placeholder with its wrapped version.

The function also works with plain strings in Node.js, because it only treats its argument as an element when typeof window !== 'undefined', i.e. when it runs in a browser.

/**
 * @param {String|Node} strOrElement - String or element (browser only)
 * @param {String[]} words - List of words to wrap
 * @param {function} replacerFn - Replacer function for each word
 */
const wrapWords = (strOrElement, words, replacerFn) => {
  const isEl = typeof window !== 'undefined' && strOrElement instanceof Node;
  if (!isEl && typeof strOrElement !== 'string') {
    throw new Error('must be text or an element');
  }
  // Reverse lexical order puts 'red car' ahead of its prefix 'red'
  const sorted = words.slice().sort().reverse();
  const text = isEl ? strOrElement.textContent : strOrElement;
  const result = sorted
    // Pass 1: swap each word for a numbered placeholder such as ${0}
    // (a string pattern replaces only the first occurrence)
    .reduce((curr, word, idx) => curr.replace(word, `$\{${idx}\}`), text)
    // Pass 2: swap each placeholder for the wrapped word
    .replace(/\$\{(\d+)\}/g, (m, p1) => replacerFn(sorted[parseInt(p1, 10)]));
  if (isEl) strOrElement.innerHTML = result;
  return result;
};

const words = [ 'red car', 'blue house', 'red' ];
const el = document.querySelector('p');
const fn = word => `<span class="highlight_text">${word}</span>`;
wrapWords(el, words, fn);

.highlight_text {
  background: #FFA;
  border: thin dashed red;
  padding: 0.125em 0.25em;
}

<p>There is a red car parked in front of a blue house with a fence painted red.</p>
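
The two-pass idea can also be tried in Node.js without a DOM. The following is a minimal string-only sketch, not the answer's code: wrapAll is a hypothetical name, it sorts by length and replaces every occurrence of a word (via split/join) rather than only the first, and it assumes that the words are non-empty, that the input text contains no ${n} tokens of its own, and that no search word consists of digits or of the characters $, { and }.

```javascript
// Hypothetical helper: two-pass placeholder wrapping for plain strings.
const wrapAll = (text, words, replacerFn) => {
  // Longest first, so 'red car' is claimed before its prefix 'red'.
  const sorted = [...words].sort((a, b) => b.length - a.length);

  // Pass 1: swap every occurrence of each word for a numbered placeholder.
  const marked = sorted.reduce(
    (curr, word, idx) => curr.split(word).join('${' + idx + '}'),
    text
  );

  // Pass 2: swap each placeholder for the wrapped word.
  return marked.replace(/\$\{(\d+)\}/g, (m, p1) => replacerFn(sorted[Number(p1)]));
};

const sentence = 'A red car, a blue house and a red fence.';
console.log(wrapAll(sentence, ['red car', 'blue house', 'red'], (w) => `[${w}]`));
// → A [red car], a [blue house] and a [red] fence.
```

Because the placeholders contain no letters, a shorter word such as 'red' cannot match inside text that a longer word has already claimed. Like the function above, this sketch matches substrings, so 'red' would also be wrapped inside a word such as 'reduced'.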


Extensibility

If you want a mix of text and regular expressions in your "word list", you can modify the function above to cache each match. A regular expression can match different text each time, so the matched text itself has to be stored. In this example, the replacements come from the cache rather than from the reverse-sorted list of words.

/**
 * @param {String|Node} strOrElement - String or element (browser only)
 * @param {(String|RegExp)[]} words - List of words or patterns to wrap
 * @param {function} replacerFn - Replacer function for each match
 */
const wrapWords = (strOrElement, words, replacerFn) => {
  const isEl = typeof window !== 'undefined' && strOrElement instanceof Node;
  if (!isEl && typeof strOrElement !== 'string') {
    throw new Error('must be text or an element');
  }
  const text = isEl ? strOrElement.textContent : strOrElement;
  const cache = [];
  const result = words.slice().sort().reverse()
    .reduce((curr, word, idx) => curr.replace(word, (m) => {
      // Remember the text that was actually matched for this word or pattern
      cache[idx] = [ ...(cache[idx] || []), m ];
      return `$\{${idx}\}`;
    }), text)
    .replace(/\$\{(\d+)\}/g, (m, p1) => {
      // Consume cached matches first-in, first-out so they keep their original order
      return replacerFn(cache[parseInt(p1, 10)].shift());
    });
  if (isEl) strOrElement.innerHTML = result;
  return result;
};

const wl1 = [ 'red car', 'blue house', 'red' ];
const el1 = document.querySelector('p:nth-child(1)');
wrapWords(el1, wl1, word => `<span class="highlight_text">${word}</span>`);

const wl2 = [ /\$\d+\.\d+/g ];
const el2 = document.querySelector('p:nth-child(2)');
wrapWords(el2, wl2, word => `<span class="highlight_text">${word}</span>`);

.highlight_text {
  background: #FFA;
  border: thin dashed red;
  padding: 0.125em 0.25em;
}

<p>There is a red car parked in front of a blue house with a fence painted red.</p>
<p>The price of the watch was reduced from $500.00 down to $199.99.</p>
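
The cache idea can be checked outside the browser as well. The following string-only sketch is an illustration under stated assumptions rather than the answer's code: wrapPatterns and toGlobalRegExp are hypothetical names, every pattern is turned into a global RegExp so that all occurrences are replaced, patterns are applied in the order given (so more specific ones should come first), and no pattern is expected to match the ${n} placeholder text itself.

```javascript
// Hypothetical helper: turn a string or RegExp into a global RegExp.
const toGlobalRegExp = (p) =>
  p instanceof RegExp
    ? new RegExp(p.source, p.flags.includes('g') ? p.flags : p.flags + 'g')
    : new RegExp(p.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g');

// Hypothetical helper: wrap every match of every pattern in a plain string.
const wrapPatterns = (text, patterns, replacerFn) => {
  // cache[i] holds the texts matched by patterns[i], in order of appearance.
  const cache = patterns.map(() => []);

  // Pass 1: store each match and leave a numbered placeholder in its place.
  const marked = patterns.reduce(
    (curr, pattern, idx) =>
      curr.replace(toGlobalRegExp(pattern), (m) => {
        cache[idx].push(m);
        return '${' + idx + '}';
      }),
    text
  );

  // Pass 2: placeholders are visited left to right, so take matches
  // from the front of each queue to keep them in their original positions.
  return marked.replace(/\$\{(\d+)\}/g, (m, p1) => replacerFn(cache[Number(p1)].shift()));
};

const priceText = 'The watch was reduced from $500.00 down to $199.99.';
console.log(wrapPatterns(priceText, ['watch', /\$\d+\.\d+/], (w) => `<b>${w}</b>`));
// → The <b>watch</b> was reduced from <b>$500.00</b> down to <b>$199.99</b>.
```

Taking matches from the front of the queue matters as soon as one pattern matches more than once: taking them from the back would swap the two prices in this example.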


 