
Split string based on parentheses in JavaScript

I have a string like the one below:
Sum(Height(In))

I need to split it like this:
Sum
Height(In)

I tried the following regex, but had no luck:

/[ .:;?!~,`"&|^\((.*)\)$<>{}\[\]\r\n/\\]+/

Is there any way to achieve this?

Thanks in advance.

You can match everything up to the first ( and then everything between that first ( and the ) at the end of the string, and use the two captured groups:

 const [_, one, two] = "Sum(Height(In))".match(/^([^()]+)\((.*)\)$/);
 console.log(`The first value is: ${one}, the second is ${two}`);

See the regex demo. If the last ) is not at the end of the string, you can remove the $ end-of-string anchor. If there can be line breaks inside, replace .* with [\w\W]*.

Regex details:

  • ^ - start of string
  • ([^()]+) - Group 1: one or more chars other than ( and )
  • \( - a ( char
  • (.*) - Group 2: any zero or more chars other than line break chars, as many as possible (* is greedy)
  • \) - a ) char
  • $ - end of string.
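One caveat with the destructuring shown above is that match returns null when the pattern does not match, so the assignment would throw. A minimal sketch of a guarded version follows; the helper name splitOuter is hypothetical (not from the answer), and it uses the [\w\W]* variant so the inner part may contain line breaks:

```javascript
// Hypothetical helper (the name is an assumption, not part of the answer).
// Returns [outer, inner] for input shaped like name(...), or null otherwise.
function splitOuter(input) {
  // [\w\W]* instead of .* so that the inner part may span line breaks.
  const match = input.match(/^([^()]+)\(([\w\W]*)\)$/);
  return match ? [match[1], match[2]] : null;
}

console.log(splitOuter("Sum(Height(In))")); // the two parts: Sum and Height(In)
console.log(splitOuter("Sum"));             // null, because there is no "(...)" part
```

Returning null rather than throwing leaves it to the caller to decide how to treat input that has no parenthesized part.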

You can do it, but only in a limited way. You need to fix the maximum number of parenthesis levels to allow, because the unbounded case defines a language that is not regular. Regular expressions accept regular languages (languages that can be described by a restricted kind of grammar, called a regular grammar, or recognized by a finite state automaton), while the language of parentheses nested to unbounded depth requires a context-free grammar (and the algorithm is normally implemented as a stack-based automaton).

The solution in Wiktor Stribiżew's answer is valid if you are willing to accept any expression, including ones with unbalanced parentheses (more opening parentheses than closing ones, or vice versa). If you want to stop exactly at the parenthesis that matches the initial one, you need a context-free grammar parser. See below for an explanation of why.
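For this particular task, a full parser is not the only stack-based option. Since there is only one kind of bracket, the stack reduces to a depth counter. The sketch below is an illustration under that assumption; the function name splitAtMatchingParen and the shape of the returned object are hypothetical:

```javascript
// Hypothetical helper: split "name(args)rest" at the ")" that balances the
// first "(", tracking nesting depth with a counter (a one-symbol stack).
function splitAtMatchingParen(input) {
  const open = input.indexOf("(");
  if (open === -1) return null; // no parenthesis at all
  let depth = 0;
  for (let i = open; i < input.length; i++) {
    if (input[i] === "(") depth++;
    else if (input[i] === ")") depth--;
    if (depth === 0) {
      return {
        name: input.slice(0, open),
        inner: input.slice(open + 1, i),
        rest: input.slice(i + 1),
      };
    }
  }
  return null; // input ended before the first "(" was closed
}

console.log(splitAtMatchingParen("Sum(Height(In))"));
console.log(splitAtMatchingParen("Sum(a)(b)")); // inner is "a", rest is "(b)"
```

On "Sum(a)(b)" the greedy regex would capture "a)(b" as the inner part, whereas the counter stops at the parenthesis that actually closes the first one.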

To get the regular expression, you must first express what can appear at the innermost level (the deepest nesting level): something that allows neither an opening nor a closing parenthesis. For this explanation I will stop at three levels of parentheses; you can extend it further, the only requirements being that you stop at some level and have enough patience to do it. Below is a regular expression that allows anything but a parenthesis:

[^()]*

Let me call this expression L0. To allow a pair (or a sequence of pairs) of matching parentheses, we can form a second regexp, L1, as shown below. The notation {L0} stands for the regexp above; I set it off with braces so that the operators in the regular expression are easier to see:

{L0} (\( {L0} \) {L0} )*

This means a sequence of L0 expressions interspersed with L0 expressions enclosed in a pair of parentheses. I'll expand {L0} only in this case, to illustrate how the regular expression gets more and more complex at each stage (you can build this regular expression with a program, and you'll get a very complex regular expression that parses a bounded number of nested parentheses very efficiently):

[^()]* (\( [^()]* \) [^()]* )*

(I left the spaces in for readability, but to use the regexp you need to remove all the embedded spaces.)
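As a quick check, here is L1 with the spaces removed and tried against a few inputs. The ^ and $ anchors are an addition of this sketch (so that the whole input must match), and the variable name level1 is hypothetical:

```javascript
// L1 from above with the readability spaces removed. The ^ and $ anchors
// are an assumption of this sketch so that the whole input must match.
const level1 = /^[^()]*(\([^()]*\)[^()]*)*$/;

console.log(level1.test("a(b)c(d)")); // true: balanced, one level of nesting
console.log(level1.test("a(b(c))"));  // false: two levels of nesting
console.log(level1.test("a(b"));      // false: unbalanced
```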

This regular expression can be called L1, and it serves to build the regular expression of level 2, which is formed by the following sequence:

{L1} (\( {L1} \) {L1} )*

where each {L1} is expanded with the regular expression we got above. This expression will be called L2.

From here a pattern emerges: for a maximum of n levels you repeat this process, substituting the level n-1 expression into the level n expression, which is:

{Ln-1} (\( {Ln-1} \) {Ln-1} )*

and this regular expression will be called Ln. The total length of the regular expression grows by a factor of at least three at each level of nesting, so for n levels you can expect your regexp to be on the order of 6*3^n characters; for, e.g., six levels of parenthesis nesting that is 6*3^6 = 4374 characters or more. If you have a computer, you can use it to generate the regular expression, compile it, and see how efficient it is (in one pass, checking just one character at a time, you learn whether a string parenthesized up to six levels deep matches).
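The substitution can be left to a program, as suggested above. The following is a minimal sketch; the function name buildLevelSource is hypothetical, and the ^ and $ anchors are again an addition so that the whole input must match:

```javascript
// Hypothetical helper: builds the source text of the level-n expression by
// repeated substitution, starting from L0 = [^()]*.
function buildLevelSource(n) {
  let source = "[^()]*"; // L0
  for (let level = 1; level <= n; level++) {
    // Ln = {Ln-1} (\( {Ln-1} \) {Ln-1} )*  with the spaces removed
    source = `${source}(\\(${source}\\)${source})*`;
  }
  return source;
}

// Anchored so that the whole input must match (an assumption of this sketch).
const level3 = new RegExp(`^${buildLevelSource(3)}$`);
console.log(level3.test("Sum(Height(In))")); // true: two levels of nesting
console.log(level3.test("a(b(c(d(e))))"));   // false: four levels of nesting

// Length of the generated source for each level from 0 to 6.
for (let n = 0; n <= 6; n++) {
  console.log(n, buildLevelSource(n).length);
}
```

With this construction the source is 6 characters long for L0, 25 for L1 and 6922 for L6, which is above the 6*3^6 estimate because each level also adds seven characters of operators around the three copies. Note that JavaScript's regex engine works by backtracking, so the one-pass behavior described above may not carry over to this naive construction.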

Going beyond a few levels poses a serious problem for the regexp, and a context-free grammar parser has to be used. It's common to parse JSON data structures with over 10 levels of nesting, which would require a regexp of around 6*3^10 characters (roughly 354k characters), making this approach impractical.
