Split JavaScript string using whitespace and special characters, keeping the characters intact

I know there are tons of questions on this topic, but I can't seem to produce a RegEx that works. I would like to split a string on whitespace AND a special character, but keep both intact in the output.

I have the following string: "tobe " + '\x02' + " or nottob " and would like the expression to produce the following output: ["tobe", " ", "\x02", " ", "or", " ", "nottob", " "]

\x02 is a special block character and I'm using it as a placeholder.

EDIT: I have tried sentence.split(/(\S+\s+)/); where sentence is "tobe " + '\x02' + " or nottob ", but end up with "tobe or nottob "

Thanks

You need to match each piece of the string and separate the groups.
I wrote a pattern that does exactly this: ([a-z\\0-9]+)|(\s+)
The first group matches every word made of a-z, 0-9, or \.
The second group matches whitespace.
The result is an array of all the matches, including the whitespace.

Also, if you match using /pattern/gi, the g flag makes the match global and the i flag makes it case-insensitive, so a-z also covers A-Z without extra characters in the pattern.

Edit: A shorter method is to use the non-whitespace class (\S), which matches every character except whitespace. So you can use (\S+|\s+), which matches every run of non-whitespace characters and every run of whitespace. It's shorter, but you can't choose which characters to keep: it keeps everything that isn't whitespace, along with everything that is.
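As a rough sketch of this shorter pattern applied to the string from the question, where '\x02' is the control character itself and not literal backslash text (the helper name tokenize is only illustrative and not part of the original answer):

```javascript
// Hypothetical helper: split a string into alternating runs of
// non-whitespace and whitespace, keeping both kinds of run.
function tokenize(input) {
  // match() with the g flag returns every match, or null when there is none.
  return input.match(/\S+|\s+/g) || [];
}

const tokens = tokenize("tobe " + "\x02" + " or nottob ");
// JSON.stringify makes the control character visible as an escape sequence.
console.log(JSON.stringify(tokens));
```

The placeholder counts as non-whitespace, so it only comes out as its own token here because it is surrounded by spaces.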

The reason you can't match the \ directly is that it is an escape character, so you need \\ instead. I've included this in the live JS example as well.

However, when using console.log, it doesn't need escaping.

Live JavaScript example:

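let matchString = "tobe \\x02 or nottob ";

// Words made of letters, digits, or a backslash, or runs of whitespace
let firstMatch = matchString.match(/[a-z\\0-9]+|\s+/gi);
console.log(firstMatch);

// Runs of non-whitespace, or runs of whitespace
let secondMatch = matchString.match(/\S+|\s+/gi);
console.log(secondMatch);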
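The live example treats the placeholder as the literal text \x02. If the string holds the control character itself, as in the question, and the placeholder may sit directly next to a word, one possible variant matches it as its own alternative. This is a sketch that goes beyond the original answer; the function name tokenizeWithPlaceholder is hypothetical:

```javascript
// Hypothetical variant: treat the control character \x02 as its own token,
// even when it touches a word with no whitespace in between.
function tokenizeWithPlaceholder(input) {
  // Alternatives, in order: the placeholder, a run of characters that are
  // neither whitespace nor the placeholder, or a run of whitespace.
  return input.match(/\x02|[^\s\x02]+|\s+/g) || [];
}

console.log(JSON.stringify(tokenizeWithPlaceholder("tobe\x02or not")));
```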

If you want to use a regex:

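var str = "tobe   " + '\x02' + " or nottob    ";

// The capture group keeps the delimiters in the result.
// \x02 (not \\x02) matches the control character itself.
var out = str.split(/(\s+|\x02)/);

// Adjacent delimiters and trailing whitespace leave empty strings in the array.
console.log(out);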

[EDIT]

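var str = "tobe " + '\x02' + " or nottob ";

// \x02 (not \\x02) matches the control character itself.
var out = str.split(/(\s+|\x02)/);

// Drop the empty strings that split() leaves between adjacent delimiters.
out = out.filter(Boolean);
console.log(out);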

(.filter(Boolean) removes the empty items.)
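The same split-and-filter idea can be wrapped in a small helper that takes the placeholder as a parameter. This is only a sketch: the names escapeRegExp and splitKeeping are made up for illustration, and it assumes the placeholder is a non-empty string.

```javascript
// Hypothetical helper: escape characters that are special inside a RegExp,
// so an arbitrary placeholder can be embedded in a pattern safely.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: split on whitespace runs and on the placeholder,
// keeping both in the result. Assumes placeholder is non-empty.
function splitKeeping(input, placeholder) {
  // The capture group makes split() include the delimiters in its output.
  const pattern = new RegExp("(\\s+|" + escapeRegExp(placeholder) + ")");
  // filter(Boolean) drops the empty strings between adjacent delimiters.
  return input.split(pattern).filter(Boolean);
}

const parts = splitKeeping("tobe " + "\x02" + " or nottob ", "\x02");
console.log(JSON.stringify(parts));
```

Building the pattern from a parameter avoids hard-coding \x02, at the cost of needing the escape step for placeholders such as * or ( that have a meaning in regex syntax.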
