
Split a string based on words and portions highlighted with the `^` sign

I have a string in which some portions are highlighted with ^ signs:

const inputValue = 'jhon duo ^has a car^ right ^we know^ that';

How can I split it on words and on ^ highlights, so that the result is this array:

['jhon','duo', 'has a car', 'right', 'we know', 'that']

Neither const input = inputValue.split('^'); (split by ^) nor const input = inputValue.split(' '); (split by words) works on its own, so I think a better idea is needed.
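
To show what goes wrong, here is what each call returns on its own (byCaret and bySpace are just illustrative names):

```javascript
const inputValue = 'jhon duo ^has a car^ right ^we know^ that';

// Splitting on '^' keeps each highlight whole, but leaves the plain text
// unsplit and untrimmed.
const byCaret = inputValue.split('^');
console.log(byCaret); // ['jhon duo ', 'has a car', ' right ', 'we know', ' that']

// Splitting on ' ' separates the words, but also cuts through the highlights
// and keeps the carets.
const bySpace = inputValue.split(' ');
console.log(bySpace); // ['jhon', 'duo', '^has', 'a', 'car^', 'right', '^we', 'know^', 'that']
```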

How would you do this?

You can use matchAll with a regular expression:

    const inputValue = 'jhon duo ^has a car^ right ^we know^ that';
    const result = Array.from(
      inputValue.matchAll(/\^(.*?)\^|([^^\s]+)/g),
      ([, a, b]) => a || b
    );
    console.log(result);

  • \^(.*?)\^ will match a literal ^ and all characters until the next ^ (including it), and the inner part is captured in a capture group
  • ([^^\s]+) will match a series of non-white space characters that are not ^ (a "word") in a second capture group
  • | makes the above two patterns alternatives: if the first doesn't match, the second is tried.
  • The Array.from callback will extract only what occurs in a capture group, so excluding the ^ characters.
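
As a rough illustration of the points above, the sketch below records, for each match, the full matched text and which alternative produced it. The names pattern, details, kind and value are illustrative only, and ?? assumes a runtime with nullish coalescing (for example Node 14 or later).

```javascript
const inputValue = 'jhon duo ^has a car^ right ^we know^ that';
const pattern = /\^(.*?)\^|([^^\s]+)/g;

// Destructure each match into the full text and the two capture groups.
// Only one of the two groups is defined for any given match.
const details = Array.from(inputValue.matchAll(pattern), ([full, highlighted, word]) => ({
  full,                                                    // includes the carets, if any
  kind: highlighted !== undefined ? 'highlight' : 'word',  // which alternative matched
  value: highlighted ?? word,                              // the captured text without carets
}));

console.log(details);
```

For the third match, full is '^has a car^' while value is 'has a car', which is why the answer reads the capture groups instead of the whole match.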

trincot's answer is good, but here's a version that doesn't use a regex and throws an error when the ^ characters are mismatched:

    function splitHighlights(inputValue) {
      // Splitting on '^' yields segments that alternate: plain, highlighted, plain, ...
      const inputSplit = inputValue.split('^');
      let highlighted = true;
      const result = inputSplit.flatMap(splitVal => {
        highlighted = !highlighted; // flips on every segment; the first segment is plain
        if (splitVal === '') {
          return [];
        } else if (highlighted) {
          return splitVal.trim(); // keep a highlighted segment as a single item
        } else {
          return splitVal.trim().split(' '); // break a plain segment into words
        }
      });
      // An odd number of '^' leaves the last segment flagged as highlighted.
      if (highlighted) {
        throw new Error(`unmatched '^' char; expected an even number of '^' characters in input`);
      }
      return result;
    }

    console.log(splitHighlights('^jhon duo^ has a car right ^we know^ that'));
    console.log(splitHighlights('jhon duo^ has^ a car right we^ know that^'));
    console.log(splitHighlights('jhon duo^ has a car^ right ^we know^ that'));
    // This input has five '^' characters, so the call throws.
    console.log(splitHighlights('jhon ^duo^ has a car^ right ^we know^ that'));

You can still use split() but capture the split-sequence to include it in the output.
For splitting you could use / *\^([^^]*)\^ *| +/ to get trimmed items in the results.

    const inputValue = 'jhon duo ^has a car^ right ^we know^ that';
    // filtering avoids empty items if the split-sequence is at the start or end
    let input = inputValue.split(/ *\^([^^]*)\^ *| +/).filter(Boolean);
    console.log(input);

The regex matches:

  •  *\^ (the pattern starts with a space) matches any number of spaces followed by a literal caret
  • ([^^]*) captures any number of non-caret characters
  • \^ * matches a literal caret followed by any number of spaces
  • | + means: OR split at one or more spaces
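
To make the role of filter(Boolean) more concrete, here is a small sketch that compares the raw result of split() with the filtered one. The names raw and filtered are illustrative. When the split happens on plain spaces, the capture group does not take part in the match, so split() inserts undefined at that position.

```javascript
const inputValue = 'jhon duo ^has a car^ right ^we know^ that';

// Raw result: captured highlights are spliced in between the pieces, and a
// split on plain spaces contributes an undefined entry for the unused group.
const raw = inputValue.split(/ *\^([^^]*)\^ *| +/);
console.log(raw); // ['jhon', undefined, 'duo', 'has a car', 'right', 'we know', 'that']

// filter(Boolean) removes the undefined entries, and any empty strings that
// appear when a split-sequence sits at the start or end of the input.
const filtered = raw.filter(Boolean);
console.log(filtered); // ['jhon', 'duo', 'has a car', 'right', 'we know', 'that']
```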


 