Regex Lookbehind in JavaScript Alternatives
I'm trying to use the following regex in JS:
(?<=@[A-Z|a-z]+,)\s|(?<=@[A-Z|a-z]+,\s[A-Z|a-z]+)\s(?=\[[A-Z|a-z]+\])
which translates to:
match all spaces which are preceded by:
    @
    followed by any number of characters in the range A-Z or a-z
    followed by a comma
OR
match all spaces which are preceded by:
    @
    followed by any number of characters in the range A-Z or a-z
    followed by a comma
    followed by a space
    followed by any number of characters in the range A-Z or a-z
AND are succeeded by:
    [
    followed by any number of characters in the range A-Z or a-z
    ]
However, JS doesn't support lookbehind. Is there any alternative for supporting the above regex in JS, or any npm library I can use instead?
So if we have a sentence like
Hi my name is @John, Doe [Example] and I am happy to be here
that should become
Hi my name is @John,Doe[Example] and I am happy to be here
Also, if we have something like
Hi my name is @John, Smith Doe [Example]
that should become
Hi my name is @John,SmithDoe[Example]
I've updated my answer for the new input:
console.clear();

var inputEl = document.querySelector('#input')
var outputEl = document.querySelector('#output')

function rep (e) {
  var input = e.target.value;
  // Matches "@Last name, First names" with an optional trailing " [Tag]"
  var reg = /@([a-z]+?\s*?)+,(\s+[a-z]+)+(\s\[[a-z]+\])?/gim
  var matches = input.match(reg);
  var output = input;

  if (matches) {
    // Maps each matched span (escaped for use as a pattern) to its whitespace-free form
    var replaceMap = new Map()
    for (var i = 0; i < matches.length; i++) {
      // Escape the square brackets so the span can be compiled with new RegExp
      var m = matches[i]
        .replace(/\[/, '\\[')
        .replace(/\]/, '\\]')
      replaceMap.set(m, matches[i].replace(/\s+/gm, ''))
    }
    for (var [s, r] of replaceMap) {
      output = output.replace(new RegExp(s, 'gm'), r)
    }
  }

  outputEl.textContent = output
}

inputEl.addEventListener('input', rep)
// Fire once so the initial textarea content is processed
inputEl.dispatchEvent(new Event('input'))
textarea { width: 100%; min-height: 100px; }
<h3>Input</h3>
<textarea id="input">@Lopez de la Cerda, Antonio Gabriel Hugo David [Author]. I'm the father of @Marquez, Maria</textarea>
<h3>Output (initially empty)</h3>
<p id="output"></p>
<h3>Expected result (on initial input)</h3>
<p>@LopezdelaCerda,AntonioGabrielHugoDavid[Author]. I'm the father of @Marquez,Maria</p>
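The same match-then-strip idea can also be written without the DOM. The following is only a sketch, not the answer's original code: squeezeNames is a hypothetical name, and the pattern is a simplified variant that assumes names consist of ASCII letters only. It passes a callback to String.prototype.replace, so the whitespace inside each matched span is removed in a single pass.

```javascript
// Hypothetical helper (not from the original answer): removes all whitespace
// inside spans shaped like "@Last name, First names [Tag]".
// Assumption: names are made of ASCII letters only.
function squeezeNames(text) {
  var span = /@[a-z]+(?:\s+[a-z]+)*,(?:\s+[a-z]+)+(?:\s+\[[a-z]+\])?/gi;
  // The callback receives each matched span and returns it without whitespace.
  return text.replace(span, function (match) {
    return match.replace(/\s+/g, '');
  });
}

console.log(squeezeNames('Hi my name is @John, Smith Doe [Example]'));
// Hi my name is @John,SmithDoe[Example]
```

Like the snippet above, this pattern is greedy: when no [Tag] follows, every subsequent run of letters is treated as part of the name, so it is only suitable when the name is followed by a bracketed tag, punctuation, or the end of the input.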
It works at least in Chrome with this regex:
/(?<=@[a-z]+,)\s+(?![a-z]+\s+\[[a-z]+\])|(?<=(@[a-z]+,\s[a-z]+))\s+(?=\[[a-z]+\])/gmi
See: https://regex101.com/r/elTkRe/4
But you can't use it in PCRE, because PCRE does not allow variable-length quantifiers such as + inside lookbehinds. Each lookbehind must have a fixed width. See the errors to the right here: https://regex101.com/r/ZC3XmX/2
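console.clear();

// Groups 1-3: "@Name, Word", the whitespace to drop, "[Tag]"
// Groups 4-5: "@Name," and the whitespace to drop
var reg = /(@[A-Za-z]+,\s[A-Za-z]+)(\s+)(\[[A-Za-z]+\])|(@[A-Za-z]+,)(\s+)/gm

var probes = [
  '@gotAMatch, <<<',
  '@LongerWithMatch, <<<',
  '@MatchHereAsWell, <<<',
  '@Yup, <<<<',
  '@noMatchInThisLine,<<<<<',
  '@match, match [match]<<<<<<<',
  '@ noMatchInThisLine, <<<<'
]

// Keep the captured text and leave out the whitespace groups ($2 and $5)
for (var probe of probes) {
  console.log(probe.replace(reg, '$1$3$4'))
}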
.as-console-wrapper { max-height: 100% !important; top: 0; }
What you need to do is convert the lookbehinds to capturing groups so that they can be included in the replacement string (note that the case-insensitive flag i is set):
(@[a-z]+,)([\t ]*([a-z]+)[\t ]*(?=\[[a-z]+\])|[\t ]+)
Replace with $1$3 if you want to remove those spaces.
See live demo here
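To make the replacement concrete, here is a minimal sketch that applies the pattern above with the g and i flags. The function name removeSpaces is made up for illustration and is not part of the answer.

```javascript
// Hypothetical wrapper around the pattern above; the name is an assumption.
function removeSpaces(text) {
  // Group 1: "@Name,"   Group 3: the word right before "[Tag]", if any.
  var re = /(@[a-z]+,)([\t ]*([a-z]+)[\t ]*(?=\[[a-z]+\])|[\t ]+)/gi;
  // When group 3 did not take part in the match, $3 expands to an empty string.
  return text.replace(re, '$1$3');
}

console.log(removeSpaces('Hi my name is @John, Doe [Example] and I am happy to be here'));
// Hi my name is @John,Doe[Example] and I am happy to be here
```

Note that this pattern handles a single word between the comma and the bracket. With two words, as in @John, Smith Doe [Example], only the space after the comma is removed.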
Just update your Node.js version. Lookbehind assertions are part of ECMAScript 2018 and are already implemented in Chromium and Node.js. According to http://kangax.github.io/compat-table/es2016plus/, Chromium 70 and Node.js 8.10 have this feature.
I just tested it in my browser and in Node.js (v8.11) and can confirm that:
node -e "console.log('nothing@xyz, bla'.match(/(?<=@[A-Za-z]+,)\s+/))"
If you can't update, you have to use other strategies such as capture and replace, which should not be a big issue with a positive lookbehind (negative ones are harder):
const hit = 'nothing@xyz, bla'.match(/(@[A-Za-z]+,)\s+/)
// Strip the captured prefix from the full match, leaving only the whitespace
hit[0].replace(hit[1], '')
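If the same code has to run on both older and newer engines, one option is to detect lookbehind support at run time and fall back to the capture-group version. This is only a sketch under that assumption; supportsLookbehind and stripSpaceAfterHandle are hypothetical names.

```javascript
// Hypothetical helper: returns true when the engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false; // engines without lookbehind throw a SyntaxError here
  }
}

// Hypothetical helper: removes whitespace that directly follows "@Name,".
function stripSpaceAfterHandle(text) {
  if (supportsLookbehind()) {
    // Built from a string so that older engines can still parse this file.
    return text.replace(new RegExp('(?<=@[A-Za-z]+,)\\s+', 'g'), '');
  }
  // Fallback: capture the prefix and write it back via $1.
  return text.replace(/(@[A-Za-z]+,)\s+/g, '$1');
}

console.log(stripSpaceAfterHandle('nothing@xyz, bla'));
// nothing@xyz,bla
```

Both branches produce the same result for this input; the fallback simply re-inserts the text that the lookbehind would have left untouched.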
If nothing else works, take a look at this project, which tries to implement lookbehind (I haven't tested it): http://blog.stevenlevithan.com/archives/javascript-regex-lookbehind