
How to get all values in parentheses in a string including nested parentheses?

Desired Behaviour

I have an input validation that, amongst other things, tests for length (< 140 chars).

My input accepts markdown, and I'd like to exclude the length of the URLs from my length calculation.

For example, something that appears as:

here is a very long link to this article on Math.random()

is 57 characters long, whereas the actual code for it is 155 characters long, i.e.:

here is a very long link to this article on [Math.random()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random)

The scenarios that need to be covered are things like:

text and [a markdown link](https://google.com)

text (and [a markdown link within parenthesis](https://google.com))

This question is about:

How to get all values in parentheses in a string, including nested parentheses.

What I've Tried

My current approach to the overall problem is:

  1. get all values within parentheses in the string
  2. if any start with https, create a copy of the string
  3. remove the values from the copied string
  4. get the length of the adjusted string and run length validation on that

These are my attempts at the first part:

01)

This solution just gets the first "match", source: https://stackoverflow.com/a/12059321

    var text = "here is a (very) long link to this article on [Math.random()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random)";
    var regExp = /\(([^)]+)\)/;
    var matches = regExp.exec(text);
    console.log(matches);
    // 0: "(very)"
    // 1: "very"

02)

This solution gets all matches, with the parentheses included, source: https://stackoverflow.com/a/30674943

    var text = "here is a (very) long link to this article on [Math.random()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random)";
    var regExp = /(?:\()[^\(\)]*?(?:\))/g;
    var matches = text.match(regExp);
    console.log(matches);
    // 0: "(very)"
    // 1: "()"
    // 2: "(https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random)"

But it doesn't work as expected in the nested parentheses scenario, i.e.:

    var text = "text (and [a markdown link within parenthesis](https://google.com))";
    var regExp = /(?:\()[^\(\)]*?(?:\))/g;
    var matches = text.match(regExp);
    console.log(matches);
    // ["(https://google.com)"]
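For the nested case, one alternative to a regular expression is to scan the string once and keep a stack of the positions of unmatched opening parentheses. The following is only a minimal sketch of that idea combined with steps 2 to 4 of the approach above; the function names `parenthesizedValues` and `lengthWithoutUrls` are made up for illustration, and unbalanced closing parentheses are simply ignored.

```javascript
// Hypothetical helper: returns the contents of every balanced pair of
// parentheses, including nested pairs, in the order the pairs close.
function parenthesizedValues(text) {
  const openPositions = []; // stack of indexes of unmatched "("
  const values = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === "(") {
      openPositions.push(i);
    } else if (text[i] === ")" && openPositions.length > 0) {
      const start = openPositions.pop();
      values.push(text.slice(start + 1, i));
    }
  }
  return values;
}

// Hypothetical helper for steps 2 to 4: remove every "(https...)" group
// from a copy of the string and return the length of what remains.
function lengthWithoutUrls(text) {
  let copy = text;
  for (const value of parenthesizedValues(text)) {
    if (value.startsWith("https")) {
      copy = copy.replace("(" + value + ")", "");
    }
  }
  return copy.length;
}

const nestedExample = "text (and [a markdown link within parenthesis](https://google.com))";
console.log(parenthesizedValues(nestedExample));
// [ "https://google.com",
//   "and [a markdown link within parenthesis](https://google.com)" ]
console.log(lengthWithoutUrls(nestedExample));
```

Note that this follows the four steps literally, so the square brackets around the link text still count towards the length.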

03)

There is a PHP regex solution here that seems to be related:

https://stackoverflow.com/a/12994041

but I couldn't figure out how to implement that regex in JavaScript, i.e.:

    preg_match_all('/^\((.*)\)[ \t]+\((.*)\)$/', $s, $matches);

Try (?<=\()[^()]+(?=\))

Explanation:

(?<=\() - assert with a positive lookbehind that what precedes is (

[^()]+ - match one or more characters other than ( and )

(?=\)) - assert with a positive lookahead that what follows is )
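As a quick illustration, here is the pattern above applied with the g flag to the two sample strings from the question. This is only a sketch: the variable names are made up, and lookbehind assertions need an engine that supports ES2018 regular expressions.

```javascript
// The pattern from the explanation above, with the global flag added.
const insideParens = /(?<=\()[^()]+(?=\))/g;

const flat = "here is a (very) long link to this article on [Math.random()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random)";
const nested = "text (and [a markdown link within parenthesis](https://google.com))";

console.log(flat.match(insideParens));
// [ "very",
//   "https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random" ]

console.log(nested.match(insideParens));
// [ "https://google.com" ]
```

The empty `()` after `Math.random` is skipped because `[^()]+` needs at least one character, and in the nested string only the innermost group is returned because `[^()]+` cannot cross the inner parentheses.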

Demo

I would use a regular expression that also requires the part in square brackets to precede the link that's within parentheses.

/\[([^\]]+)\]\([^)]+\)/g

Make sure to use the g flag. This also includes a capture group so you can differentiate the "visible" part (between square brackets) from the rest that is "invisible":

    var text = "here is a (very) long link to this article on [Math.random()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random)";
    var regExp = /\[([^\]]+)\]\([^)]+\)/g;
    var match;
    while (match = regExp.exec(text)) {
        console.log("full match: " + match[0]);
        console.log("keep: " + match[1]);
    }

You can actually use a replace call to remove the "invisible" part. That makes it easy to calculate the total number of visible characters:

    var text = "here is a (very) long link to this article on [Math.random()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/random)";
    var regExp = /\[([^\]]+)\]\([^)]+\)/g;
    console.log("original length: " + text.length);
    console.log("visible length: " + text.replace(regExp, "$1").length);
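To tie this back to the length validation in the question, the replace call can be wrapped in a small check. This is a sketch under assumptions: `visibleText`, `isWithinLimit` and `MAX_VISIBLE_LENGTH` are hypothetical names, and the limit of 140 comes from the question.

```javascript
// Hypothetical helpers, not part of the original answer.
const MAX_VISIBLE_LENGTH = 140; // limit taken from the question

function visibleText(markdown) {
  // Replace each [label](target) with just its label.
  return markdown.replace(/\[([^\]]+)\]\([^)]+\)/g, "$1");
}

function isWithinLimit(markdown, limit = MAX_VISIBLE_LENGTH) {
  // The question tests for "< 140 chars", so the comparison is strict.
  return visibleText(markdown).length < limit;
}

console.log(visibleText("text and [a markdown link](https://google.com)"));
// "text and a markdown link"
console.log(visibleText("text (and [a markdown link within parenthesis](https://google.com))"));
// "text (and a markdown link within parenthesis)"
```

Both scenarios from the question are handled, including the link inside parentheses. One limitation: because the target is matched with `[^)]+`, a URL that itself contains a `)` would be cut off at that character.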
