
How to split a string that contains multiple repeated keywords into an array in JavaScript?

I have a string like this:

const string =  'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';

and now I want to split the string by the following keywords:

const keywords = ['John smith',  '100', 'apples', '200', 'oranges', '300'];

I want to get a result like this:

const result = [
  {isKeyword: true, text: 'John Smith'},
  {isKeyword: false, text: 'I want to buy '}, 
  {isKeyword: true, text: '100'}, 
  {isKeyword: true, text:'apples'}, 
  {isKeyword: false, text:'\r\nI want to buy'}, 
  {isKeyword: true, text:'200'},
  {isKeyword: true, text:'oranges'}, 
  {isKeyword: false, text:'\r\n, and add'},
  {isKeyword: true, text:'300'},
  {isKeyword: true, text:'apples'}];

The keywords may differ in case from the text (lowercase or uppercase), but I want each piece in the array to keep exactly the casing it has in the original string.

I also want the array to keep the same order as the string, with each piece marked to show whether it is a keyword.

How could I get it?

I would start by finding the indexes of all your keywords. From these you know where every keyword in the sentence starts and stops. You can then sort them by the index at which each keyword starts.

Then it's just a matter of taking the substring up to the start of each keyword (these will be the isKeyword: false substrings) and then adding the keyword substring. Repeat until you are done.

const string = 'John Smith: I want to buy 100 apples\\r\\nI want to buy 200 oranges\\r\\n, and add 300 apples Thanks';
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];

// find the [start, end] indexes of every case-insensitive match of kw in str
function getInd(kw, str) {
  const regex = new RegExp(kw, 'gi');
  const pos = [];
  let result;
  while ((result = regex.exec(str)) !== null) {
    pos.push([result.index, result.index + kw.length]);
  }
  return pos;
}

// find all indexes of all keywords, sorted by start index
const positions = keywords.reduce((a, word) => a.concat(getInd(word, string)), []);
positions.sort((a, b) => a[0] - b[0]);

// go through the string and build the array
let start = 0;
const res = [];
for (const next of positions) {
  // gaps of at most one character (e.g. a single space between keywords) are skipped
  if (start + 1 < next[0]) {
    res.push({ isKeyword: false, text: string.slice(start, next[0]).trim() });
  }
  res.push({ isKeyword: true, text: string.slice(next[0], next[1]) });
  start = next[1];
}

// get any remaining text
if (start < string.length) {
  res.push({ isKeyword: false, text: string.slice(start).trim() });
}

console.log(res);

I'm trimming whitespace as I go, but you may want to do something different.
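If the pieces need to keep their whitespace exactly as it appears in the string, the same index-based walk can copy each gap verbatim instead of trimming it. The following is only a sketch under a few assumptions: `splitKeepingWhitespace` is a hypothetical helper name (not part of the answer above), the keywords are non-empty and contain no regular-expression metacharacters, and a match that overlaps an earlier one is simply skipped.

```javascript
// Hypothetical helper: same index-based approach, but gaps are kept untrimmed.
function splitKeepingWhitespace(text, keywords) {
  // Collect [start, end] pairs for every case-insensitive keyword match.
  const positions = [];
  for (const kw of keywords) {
    const regex = new RegExp(kw, 'gi'); // assumes kw has no regex metacharacters
    let match;
    while ((match = regex.exec(text)) !== null) {
      positions.push([match.index, match.index + match[0].length]);
    }
  }
  positions.sort((a, b) => a[0] - b[0]);

  const pieces = [];
  let start = 0;
  for (const [from, to] of positions) {
    if (from < start) continue; // skip a match that overlaps the previous one
    if (from > start) {
      pieces.push({ isKeyword: false, text: text.slice(start, from) });
    }
    pieces.push({ isKeyword: true, text: text.slice(from, to) });
    start = to;
  }
  // Any text after the last keyword is a final non-keyword piece.
  if (start < text.length) {
    pieces.push({ isKeyword: false, text: text.slice(start) });
  }
  return pieces;
}

const pieces = splitKeepingWhitespace('Buy 100 apples', ['100', 'APPLES']);
console.log(pieces);
```

Because nothing is trimmed or dropped, joining the `text` fields of the pieces gives back the original string, which is a convenient sanity check.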

If you are willing to pick a delimiter


Here's a much more succinct way to do this if you are willing to pick a set of delimiters that can't appear in your text. For example, {} is used below.

Here we simply wrap the keywords with the delimiter and then split them out. Grabbing the keyword with the delimiter makes it easy to tell which parts of the split are your keywords:

const string = 'John Smith: I want to buy 100 apples\\r\\nI want to buy 200 oranges\\r\\n, and add 300 apples Thanks';
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];

// wrap every keyword in {}, then split on the wrapped parts
const res = keywords
  .reduce((str, k) => str.replace(new RegExp(`(${k})`, 'ig'), '{$1}'), string)
  .split(/({.*?})/)
  .filter(i => i.trim())
  .map(s => s.startsWith('{')
    ? { isKeyword: true, text: s.slice(1, s.length - 1) }
    : { isKeyword: false, text: s.trim() });

console.log(res);

Use a regular expression

const rx = new RegExp('(' + keywords.join('|') + ')', 'i')

thus

str.split(rx)
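
To get from the split output to the `{isKeyword, text}` objects the question asks for, one option is to rely on how `split` treats a capturing group: the captured matches are included in the result, so non-matching text lands at even indexes and keywords at odd indexes. The sketch below assumes that mapping is what is wanted; the `escapeRegExp` helper and the `'i'` flag are additions for illustration, not part of the two lines above.

```javascript
const string = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];

// Escape regex metacharacters so each keyword is matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// One capturing group around the alternation; 'i' makes the match case-insensitive.
const rx = new RegExp('(' + keywords.map(escapeRegExp).join('|') + ')', 'i');

// With a single capturing group, split() yields non-matching text at even
// indexes and the captured keywords at odd indexes.
const result = string
  .split(rx)
  .map((text, i) => ({ isKeyword: i % 2 === 1, text }))
  .filter((piece) => piece.text !== ''); // drop empty pieces at the edges

console.log(result);
```

Whitespace between keywords is kept as its own non-keyword piece here (for example the single space between '100' and 'apples'); filter or trim those pieces if they are not wanted.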

