
Regex Lookbehind in JavaScript Alternatives

I'm trying to use the following regex in JS:

(?<=@[A-Z|a-z]+,)\s|(?<=@[A-Z|a-z]+,\s[A-Z|a-z]+)\s(?=\[[A-Z|a-z]+\])

which translates to:

match all spaces which are preceded by:

  • @
  • followed by one or more characters in the range A-Z or a-z
  • followed by a comma

OR

match all spaces which are preceded by:

  • @
  • followed by one or more characters in the range A-Z or a-z
  • followed by a comma
  • followed by a space
  • followed by one or more characters in the range A-Z or a-z

AND are succeeded by:

  • [
  • followed by one or more characters in the range A-Z or a-z
  • ]

However, JS doesn't support lookbehind. Is there any alternative for supporting the above regex in JS, or any npm library I can use instead?

So if we have a sentence like
Hi my name is @John, Doe [Example] and I am happy to be here
that should become
Hi my name is @John,Doe[Example] and I am happy to be here

Also, if we have something like
Hi my name is @John, Smith Doe [Example]
that should become
Hi my name is @John,SmithDoe[Example]

I've updated my answer for the new input.

console.clear();
var inputEl = document.querySelector('#input')
var outputEl = document.querySelector('#output')

function rep (e) {
  var input = e.target.value;
  // "@Name words, more words" with an optional trailing " [Tag]"
  var reg = /@([a-z]+?\s*?)+,(\s+[a-z]+)+(\s\[[a-z]+\])?/gim
  var matches = input.match(reg);
  var output = input;
  if (matches) {
    // Maps each match (escaped for use as a pattern) to its whitespace-free form
    var replaceMap = new Map()
    for (var i = 0; i < matches.length; i++) {
      var m = matches[i]
        .replace(/\[/, '\\[')
        .replace(/\]/, '\\]')
      replaceMap.set(m, matches[i].replace(/\s+/gm, ''))
    }
    for (var [s, r] of replaceMap) {
      output = output.replace(new RegExp(s, 'gm'), r)
    }
  }
  outputEl.textContent = output
}

inputEl.addEventListener('input', rep)
inputEl.dispatchEvent(new Event('input'))

textarea { width: 100%; min-height: 100px; }

<h3>Input</h3>
<textarea id="input">@Lopez de la Cerda, Antonio Gabriel Hugo David [Author]. I'm the father of @Marquez, Maria</textarea>
<h3>Output (initially empty)</h3>
<p id="output"></p>
<h3>Expected result (on initial input)</h3>
<p>@LopezdelaCerda,AntonioGabrielHugoDavid[Author]. I'm the father of @Marquez,Maria</p>

Backup of old answer content (for historical reasons)

It works at least in Chrome with this regex:

/(?<=@[a-z]+,)\s+(?![a-z]+\s+\[[a-z]+\])|(?<=(@[a-z]+,\s[a-z]+))\s+(?=\[[a-z]+\])/gmi

See: https://regex101.com/r/elTkRe/4

But you can't use it in PCRE, because PCRE does not allow quantifiers such as + in lookbehinds. They must be of fixed width. See the errors on the right here: https://regex101.com/r/ZC3XmX/2

Solution without lookbehinds and lookaheads

console.clear();
// Branch 1: "@name, word", the spaces, then "[word]". Branch 2: "@name," and the spaces after it.
var reg = /(@[A-Za-z]+,\s[A-Za-z]+)(\s+)(\[[A-Za-z]+\])|(@[A-Za-z]+,)(\s+)/gm
var probes = [
  '@gotAMatch, <<<',
  '@LongerWithMatch, <<<',
  '@MatchHereAsWell, <<<',
  '@Yup, <<<<',
  '@noMatchInThisLine,<<<<<',
  '@match, match [match]<<<<<<<',
  '@ noMatchInThisLine, <<<<'
]
// Keep groups 1 and 3 (branch 1) or group 4 (branch 2); the whitespace groups 2 and 5 are dropped.
for (var i in probes) {
  console.log(probes[i].replace(reg, '$1$3$4'))
}

.as-console-wrapper { max-height: 100% !important; top: 0; }

What you need to do is convert the lookbehinds into capturing groups so that they can be included in the replacement string (note that the case-insensitive flag (i) is set):

(@[a-z]+,)([\t ]*([a-z]+)[\t ]*(?=\[[a-z]+\])|[\t ]+)

Replace with $1$3 if you want to remove those spaces.

See live demo here
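As a rough illustration of the capture-group replacement above, the sketch below applies the pattern with the g and i flags to the two sample sentences from the question. The helper name collapse is made up for this example. Note that with this pattern only a single word before the bracketed tag is handled, so in the two-word case only the space after the comma is removed.

```javascript
// Pattern and replacement string are taken from the answer above;
// the function name `collapse` is a hypothetical helper for this sketch.
const pattern = /(@[a-z]+,)([\t ]*([a-z]+)[\t ]*(?=\[[a-z]+\])|[\t ]+)/gi;

function collapse(text) {
  // $1 is the "@name," prefix. $3 is the single word before "[...]",
  // or an empty string when the second branch ([\t ]+) matched.
  return text.replace(pattern, '$1$3');
}

console.log(collapse('Hi my name is @John, Doe [Example] and I am happy to be here'));
// Hi my name is @John,Doe[Example] and I am happy to be here

console.log(collapse('Hi my name is @John, Smith Doe [Example]'));
// Hi my name is @John,Smith Doe [Example]
```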

Just update your Node.js version. Lookbehind assertions are part of ECMAScript 2018 and are already implemented in Chromium and Node.js. According to http://kangax.github.io/compat-table/es2016plus/, Chromium 70 and Node.js 8.10 have this feature.

I just tested it in my browser and in Node.js (v8.11) and can confirm that:

node -e "console.log('nothing@xyz, bla'.match(/(?<=@[A-Za-z]+,)\s+/))"
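If the code has to run on engines of unknown age, one option is to detect lookbehind support at run time. The following is only a sketch: the function names supportsLookbehind and removeSpaceAfterHandle are invented for this example, and the lookbehind pattern is built with the RegExp constructor on the assumption that a regex literal would otherwise make the whole file fail to parse on older engines.

```javascript
// Hypothetical helpers; only RegExp and String.prototype.replace are built-ins.
function supportsLookbehind() {
  try {
    // Built from a string so an old engine throws here at run time
    // instead of rejecting the whole script at parse time.
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false;
  }
}

function removeSpaceAfterHandle(text) {
  if (supportsLookbehind()) {
    // ES2018 path: match only the whitespace and delete it.
    return text.replace(new RegExp('(?<=@[A-Za-z]+,)\\s+', 'g'), '');
  }
  // Fallback path: capture the prefix and write it back with $1.
  return text.replace(/(@[A-Za-z]+,)\s+/g, '$1');
}

console.log(removeSpaceAfterHandle('nothing@xyz, bla')); // nothing@xyz,bla
```

Both branches produce the same result for this input, so the detection mainly matters when the script must load on an engine without ES2018 support.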

If you can't update, you have to use other strategies such as capture and replace, which should not be a big issue for a positive lookbehind (negative ones are harder):

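const input = 'nothing@xyz, bla'
// Capture what the lookbehind would have asserted and write it back with $1.
const result = input.replace(/(@[A-Za-z]+,)\s+/, '$1') // 'nothing@xyz,bla'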
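The same capture-and-replace idea can be stretched to cover the multi-word case from the question by passing a replacer function instead of a replacement string. This is a sketch under assumptions, not a definitive solution: the name stripNameSpaces and the exact pattern are made up here, and names are assumed to consist of ASCII letters only.

```javascript
// Hypothetical helper. Group 1 is the "@name," prefix; group 2 (optional)
// is the run of words, each followed by whitespace, that ends right
// before a "[tag]".
const namePattern = /(@[A-Za-z]+,)\s+((?:[A-Za-z]+\s+)+(?=\[[A-Za-z]+\]))?/g;

function stripNameSpaces(text) {
  return text.replace(namePattern, (match, prefix, words = '') =>
    // Drop the whitespace after the comma and inside the captured words.
    prefix + words.replace(/\s+/g, '')
  );
}

console.log(stripNameSpaces('Hi my name is @John, Doe [Example] and I am happy to be here'));
// Hi my name is @John,Doe[Example] and I am happy to be here

console.log(stripNameSpaces('Hi my name is @John, Smith Doe [Example]'));
// Hi my name is @John,SmithDoe[Example]
```

When no bracketed tag follows, the optional group does not participate and only the whitespace directly after the comma is removed.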

If nothing else works, take a look at this project, which tries to implement lookbehind (I haven't tested it): http://blog.stevenlevithan.com/archives/javascript-regex-lookbehind
