
How to split a string which has multiple repeated keywords in it to an array in JavaScript?

I have a string like this:

const string =  'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';

Now I want to split the string by the following keywords:

const keywords = ['John smith',  '100', 'apples', '200', 'oranges', '300'];

I want to get a result like this:

const result = [
  {isKeyword: true, text: 'John Smith'},
  {isKeyword: false, text: 'I want to buy '}, 
  {isKeyword: true, text: '100'}, 
  {isKeyword: true, text:'apples'}, 
  {isKeyword: false, text:'\r\nI want to buy'}, 
  {isKeyword: true, text:'200'},
  {isKeyword: true, text:'oranges'}, 
  {isKeyword: false, text:'\r\n, and add'},
  {isKeyword: true, text:'300'},
  {isKeyword: true, text:'apples'}];

Keywords could be lowercase or uppercase; I want the text in the array to stay exactly as it appears in the original string.

I also want to keep the array in the same order as the string, while marking whether each piece of the string is a keyword.

How can I do this?

I would start by finding the indexes of all your keywords. From this you can know where all the keywords in the sentence start and stop. You can sort these by the index where each keyword starts.

Then it's just a matter of taking substrings up to the start of each keyword (these will be the isKeyword: false substrings), then adding the keyword substring. Repeat until you are done.

const string = 'John Smith: I want to buy 100 apples\\r\\nI want to buy 200 oranges\\r\\n, and add 300 apples Thanks';
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];

// find all [start, end] indexes of a keyword in str (case-insensitive)
function getInd(kw, str) {
  let regex = new RegExp(kw, 'gi'), result, pos = []
  while ((result = regex.exec(str)) != null)
    pos.push([result.index, result.index + kw.length]);
  return pos
}

// find all indexes of all keywords, sorted by start position
let positions = keywords.reduce((a, word) => a.concat(getInd(word, string)), [])
positions.sort((a, b) => a[0] - b[0])

// go through the string and make the array
let start = 0, res = []
for (let next of positions) {
  // text between the previous keyword and this one (single-character gaps are skipped)
  if (start + 1 < next[0])
    res.push({ isKeyword: false, text: string.slice(start, next[0]).trim() })
  res.push({ isKeyword: true, text: string.slice(next[0], next[1]) })
  start = next[1]
}

// get any remaining text
if (start < string.length)
  res.push({ isKeyword: false, text: string.slice(start, string.length).trim() })

console.log(res)

I'm trimming whitespace as I go, but you may want to do something different.
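If you would rather not trim, the same index-based idea can be written so that every character of the input ends up in exactly one piece. The following is only a sketch under a few assumptions: the names escapeRegExp and splitKeepingWhitespace are made up for illustration, keywords are escaped so they match literally, and a match that overlaps an earlier one is simply skipped.

```javascript
// Hypothetical helper: escape regex metacharacters so a keyword matches literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical function: split text into keyword / non-keyword pieces without trimming.
function splitKeepingWhitespace(text, keywords) {
  // Collect [start, end] ranges for every case-insensitive keyword match.
  const positions = [];
  for (const kw of keywords) {
    if (!kw) continue; // an empty keyword would match everywhere, so ignore it
    const regex = new RegExp(escapeRegExp(kw), 'gi');
    let m;
    while ((m = regex.exec(text)) !== null) {
      positions.push([m.index, m.index + m[0].length]);
    }
  }
  positions.sort((a, b) => a[0] - b[0]);

  const pieces = [];
  let start = 0;
  for (const [from, to] of positions) {
    if (from < start) continue; // skip a match that overlaps an earlier one
    if (from > start) {
      pieces.push({ isKeyword: false, text: text.slice(start, from) });
    }
    pieces.push({ isKeyword: true, text: text.slice(from, to) });
    start = to;
  }
  // Any text left after the last keyword.
  if (start < text.length) {
    pieces.push({ isKeyword: false, text: text.slice(start) });
  }
  return pieces;
}

const sample = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';
const sampleKeywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];
console.log(splitKeepingWhitespace(sample, sampleKeywords));
```

Because nothing is trimmed or dropped, joining the text fields back together reproduces the input. Note that this also emits the single space between adjacent keywords such as '100' and 'apples' as its own non-keyword piece, which the expected result in the question leaves out; filter those pieces if you do not want them.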

If you are willing to pick a delimiter


Here's a much more succinct way to do this, if you are willing to pick a set of delimiters that can't appear in your text. For example, {} is used below.

Here we simply wrap the keywords with the delimiter and then split them out. Capturing the keyword together with its delimiter makes it easy to tell which parts of the split are your keywords:

const string = 'John Smith: I want to buy 100 apples\\r\\nI want to buy 200 oranges\\r\\n, and add 300 apples Thanks';
const keywords = ['John smith', '100', 'apples', '200', 'oranges', '300'];

let res = keywords
  // wrap every (case-insensitive) keyword occurrence in {}
  .reduce((str, k) => str.replace(new RegExp(`(${k})`, 'ig'), '{$1}'), string)
  // split on the wrapped keywords, keeping them in the output
  .split(/({.*?})/)
  // drop empty and whitespace-only pieces
  .filter(i => i.trim())
  // pieces that still carry the delimiter are keywords
  .map(s => s.startsWith('{')
    ? { isKeyword: true, text: s.slice(1, s.length - 1) }
    : { isKeyword: false, text: s.trim() })

console.log(res)

Use a regular expression

const rx = new RegExp('(' + keywords.join('|') + ')', 'i')

thus

str.split(rx)
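To get from that split to the array of objects the question asks for, one possible sketch is shown below. It relies on the fact that String.prototype.split keeps the text matched by a capturing group in its output. The names escapeRegExp and splitWithCapture are hypothetical, and the sketch assumes matching should be case-insensitive, that keywords should be matched literally, and that longer keywords should win over shorter ones that share a prefix.

```javascript
// Hypothetical helper: escape regex metacharacters so a keyword matches literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical function: split on a capturing alternation and label each piece.
function splitWithCapture(text, keywords) {
  const words = keywords.filter(k => k !== '');
  if (words.length === 0) {
    // No usable keywords: the whole text is one non-keyword piece.
    return text ? [{ isKeyword: false, text }] : [];
  }

  // Lowercased keywords, used to recognise which pieces were matched.
  const lookup = new Set(words.map(k => k.toLowerCase()));

  // Longest first, so a longer keyword wins over a shorter one sharing its prefix.
  const alternation = [...words]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join('|');

  // The capturing group makes split() keep the matched keywords in the result.
  const rx = new RegExp('(' + alternation + ')', 'i');

  return text
    .split(rx)
    .filter(part => part !== '') // split() yields '' at the edges and between adjacent matches
    .map(part => ({ isKeyword: lookup.has(part.toLowerCase()), text: part }));
}

const input = 'John Smith: I want to buy 100 apples\r\nI want to buy 200 oranges\r\n, and add 300 apples';
const words = ['John smith', '100', 'apples', '200', 'oranges', '300'];
console.log(splitWithCapture(input, words));
```

The original casing and whitespace of the string are preserved, because the pieces are slices of the input rather than copies of the keyword list.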


 