
Regex: Match a character that is unmatched in a negated set only a certain number of times at the end of a string

In JavaScript, I want to match (i.e. include in the result) a closing parenthesis ")" when it appears exactly twice in a row at the end of the string, and leave it out of the match when it appears once or more than twice. The answer is probably to remove the parenthesis from the negated set and handle it elsewhere in the pattern; I have tried adapting that approach without success. My regex is fairly large and tricky, and as I am not very experienced with regexes, some constructs seem to stop working once I add them to it. Here is my regex:

/(?<![a-zA-Z/]+)(https?:)?\/{1,2}[a-zA-Z]\S*(?<=\){2}|(?<=[^;"'\]),,]))/g

My attempt is one step away from working: the current regex matches the parentheses when they appear twice and does not match a single one, but it keeps matching when they appear more than twice. For example:


Example URL: https://docs.google.com/picker?protocol=gadgets [...] &nav=(("fonts"))

Results:

[...] &nav=(("fonts") - does not match ") - good

[...] &nav=(("fonts")) - matches )) - good

[...] &nav=(("fonts"))) [...] - matches ))) but does not match the unwanted characters in the negated set - (kind of) bad

...and so on...
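The third result follows from how a lookbehind works: `(?<=\){2})` only checks that the two characters just before the current position are `))`, and that is also true at the end of `)))`. A minimal sketch with a simplified, hypothetical pattern (not the full regex from the question) reproduces the effect:

```javascript
// Simplified stand-in for the question's pattern (an assumption made for
// illustration): a word character, a run of non-whitespace characters, and
// a lookbehind requiring that the match ends right after "))".
const endsWithTwoParens = /\w\S*(?<=\){2})/;

// Returns the matched text, or null when nothing matches.
function firstMatch(regex, text) {
  const m = text.match(regex);
  return m ? m[0] : null;
}

console.log(firstMatch(endsWithTwoParens, 'x)'));   // null
console.log(firstMatch(endsWithTwoParens, 'x))'));  // 'x))'
console.log(firstMatch(endsWithTwoParens, 'x)))')); // 'x)))'
```

The lookbehind says nothing about the character before those two parentheses, so excluding a third one needs an additional check on what precedes the pair.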


I have tried various combinations of lookarounds and quantifiers, and none worked better than the regex above.

By the way, I would rather not use the start (^) and end ($) anchors in the regex, since I run it with the global flag over large and varied scripts. I may be wrong about this, so correct me if necessary; if the anchors turn out to be required, as they were in simpler regexes I tried, I will not mind too much.

As Wiktor Stribiżew requested, here is the expected behavior of the regex for the aforementioned example:


Expected results:

https://docs.google.com/picker?protocol=gadgets [...] &nav=(("fonts") - should match https://docs.google.com/picker?protocol=gadgets [...] &nav=(("fonts

https://docs.google.com/picker?protocol=gadgets [...] &nav=(("fonts")) - should match the whole URL (the original URL)

https://docs.google.com/picker?protocol=gadgets [...] &nav=(("fonts"))) - should match https://docs.google.com/picker?protocol=gadgets [...] &nav=(("fonts

It seems you can use

(?<![a-zA-Z/])(?:https?:)?\/{1,2}[a-zA-Z]\S*(?:[^\s:"'\\)]|(?<!\))\)\)(?!\S))

Or, to account for any non-word chars,

(?<![a-zA-Z/])(?:https?:)?\/{1,2}[a-zA-Z]\S*(?:\b|(?<!\))\)\)(?!\S))
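To see where the two variants differ, the sketch below runs both on a URL followed by a comma. The patterns are copied from above; the sample sentence and the variable names are made up for illustration. The comma is not in the explicit negated set, so the first pattern keeps it, while the `\b` variant stops after the last word character.

```javascript
// Variant 1: explicit negated character class before the "))" alternative.
const explicitSet =
  /(?<![a-zA-Z/])(?:https?:)?\/{1,2}[a-zA-Z]\S*(?:[^\s:"'\\)]|(?<!\))\)\)(?!\S))/;

// Variant 2: a word boundary instead of the negated class.
const wordBoundary =
  /(?<![a-zA-Z/])(?:https?:)?\/{1,2}[a-zA-Z]\S*(?:\b|(?<!\))\)\)(?!\S))/;

// Hypothetical input: a URL immediately followed by a comma.
const sample = 'See https://example.com/page, for details';

console.log(sample.match(explicitSet)[0]);  // 'https://example.com/page,'
console.log(sample.match(wordBoundary)[0]); // 'https://example.com/page'
```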

See the regex demo. Details:

  • (?<![a-zA-Z/]) - a negative lookbehind that fails the match if there is a letter or / immediately to the left of the current location
  • (?:https?:)? - an optional http: or https: string
  • \/{1,2} - one or two / chars
  • [a-zA-Z] - a letter
  • \S* - zero or more non-whitespace chars
  • (?:\b|(?<!\))\)\)(?!\S)) - either a word boundary or a )) string not preceded by another ) and not directly followed by a non-whitespace char.
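As a quick check against the expected results in the question, here is a small sketch that applies the word-boundary variant with the global flag to three lines of text. The URL is a shortened, hypothetical stand-in for the elided one in the question, and `findUrls` is a helper name invented for this example. No ^ or $ anchors are needed because `(?!\S)` is satisfied by a following newline as well as by the end of the input.

```javascript
const urlPattern =
  /(?<![a-zA-Z/])(?:https?:)?\/{1,2}[a-zA-Z]\S*(?:\b|(?<!\))\)\)(?!\S))/g;

// Hypothetical shortened URL; the real one in the question is elided.
const prefix = 'https://docs.google.com/picker?protocol=gadgets&nav=(("fonts';

// Collects every match in the text (helper name is an assumption).
function findUrls(text) {
  return Array.from(text.matchAll(urlPattern), (m) => m[0]);
}

// One, two, and three closing parentheses, each on its own line.
const text = [prefix + '")', prefix + '"))', prefix + '")))'].join('\n');
const found = findUrls(text);

console.log(found[0] === prefix);         // true: single ")" left out
console.log(found[1] === prefix + '"))'); // true: whole URL matched
console.log(found[2] === prefix);         // true: ")))" left out
```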
