Regex split only on certain characters

I have the following text that I need to split:

'(!false =>stuff <300^ OR <=200 "TEST DATA")'

There are a couple of rules. I need to preserve quoted text. Also, the delimiters I need to split on are the following:

{'<', '>', '<=', '=>', '=', '!', '(', ')'}

In this case, the split I expect is the following:

['(', '!', 'false', '=>', 'stuff', '<', '300^', 'OR', '<=', '200', '"TEST DATA"', ')']

I've gotten close with this:

input_text.match(/"[^"]*"|=[<>]|[<>]=|[<>]|[!]|[=]|[()]|\w+/g);

It works for the most part, except for one thing: characters such as ^ are not kept. So instead of getting:

300^

I'm getting just:

300

How can I keep every string intact and split only on the delimiters mentioned?

It sounds like when you match \w+, you also want any ^ characters to be part of that same matched substring. So replace \w+ with a character set that includes ^ as well as \w (the example also adds $ to the set):

const input_text = '(!false =>stuff <300$$^300 OR <=200 "TEST DATA")';
console.log(
  // Only the final alternative changed: [\w^$]+ instead of \w+
  input_text.match(/"[^"]*"|=[<>]|[<>]=|[<>]|[!]|[=]|[()]|[\w^$]+/g)
);
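As a quick check against the exact input from the question, the same pattern can be wrapped in a small helper. This is only a sketch: the names TOKEN_RE and tokenize are made up for illustration and do not come from the original answer.

```javascript
// Hypothetical names (TOKEN_RE, tokenize); the pattern itself is the one shown above.
const TOKEN_RE = /"[^"]*"|=[<>]|[<>]=|[<>]|[!]|[=]|[()]|[\w^$]+/g;

function tokenize(text) {
  // String.prototype.match returns null when nothing matches, so fall back to [].
  return text.match(TOKEN_RE) || [];
}

console.log(tokenize('(!false =>stuff <300^ OR <=200 "TEST DATA")'));
// [ '(', '!', 'false', '=>', 'stuff', '<', '300^', 'OR', '<=', '200', '"TEST DATA"', ')' ]
```

Inside a character set, ^ is only special in the first position and $ is not special at all, so [\w^$] matches both literally without escaping.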

If all but the last alternative of the regular expression already take care of the special cases, another option is to make the final alternative match any run of non-whitespace characters, instead of word (and selected special) characters. The earlier alternatives still take priority whenever one of them matches at the current position:

const input_text = '(!false =>stuff <300$$^300 OR <=200 "TEST DATA")';
console.log(
  // Only the final alternative changed: [^\s]+ instead of \w+
  input_text.match(/"[^"]*"|=[<>]|[<>]=|[<>]|[!]|[=]|[()]|[^\s]+/g)
);
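One caveat with the non-whitespace version: once [^\s]+ starts matching, it keeps going through any delimiter that follows without a space, so x=200) comes back as a single token. A possible variation, not part of the original answer, is to exclude the delimiters and the quote character from that final set. The names below are hypothetical.

```javascript
// Hypothetical names; LOOSE_RE is the pattern above, STRICT_RE is an assumed variation.
const LOOSE_RE = /"[^"]*"|=[<>]|[<>]=|[<>]|[!]|[=]|[()]|[^\s]+/g;
// The final alternative stops at whitespace, at any delimiter, and at a double quote.
// Side effect: a stray double quote with no closing partner is skipped entirely.
const STRICT_RE = /"[^"]*"|=[<>]|[<>]=|[<>]|[!]|[=]|[()]|[^\s<>=!()"]+/g;

const sample = '(x=200)';
const looseTokens = sample.match(LOOSE_RE) || [];
const strictTokens = sample.match(STRICT_RE) || [];

console.log(looseTokens);  // [ '(', 'x=200)' ]
console.log(strictTokens); // [ '(', 'x', '=', '200', ')' ]
```

For the input in the question both patterns give the same tokens, because every delimiter there is either at the start of a token or preceded by whitespace or a quoted string.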
