
RegEx to capture and replace exact matches, with expressions that might contain special characters

I am trying to create some sort of translation engine. I am given a mapping of attributes, and using this mapping I need to translate an expression.

For example, I have this mapping:

{
  "a": "A",
  "b[*]": "B[*]",
  "c": "C",
  "ac": "D"
}

An example input might be the following (the rule is that each token can be followed by a "."; the expression is actually a JSONPath):

ab[*] should translate to AB[*]

Each input can appear anywhere in the expression, so I can have a mix, for example: a.c.b[*] => A.C.B[*]

My solution was to create a list of regexes from this mapping and loop over it, searching and replacing in the expression for each regex.

The problem is with inputs like this: ac should translate to D, but since mappings also exist for a and c, their regexes match and I get AC instead.
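To make the failure concrete, here is a minimal sketch of the per-key loop described above. The names translateNaive and escapeRegExp are hypothetical, and the iteration order (the mapping's own key order) is an assumption, since the original loop is not shown.

```javascript
// Hypothetical reconstruction of the per-key loop; not the asker's actual code.
const map = { "a": "A", "b[*]": "B[*]", "c": "C", "ac": "D" };

// Escape regex metacharacters so keys such as "b[*]" are matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

function translateNaive(expr) {
  let result = expr;
  // One independent search-and-replace pass per key, in the mapping's key order.
  for (const [key, value] of Object.entries(map)) {
    result = result.replace(new RegExp(escapeRegExp(key), "g"), value);
  }
  return result;
}

console.log(translateNaive("ac"));       // "AC": the "a" and "c" passes run before "ac" is tried
console.log(translateNaive("a.c.b[*]")); // "A.C.B[*]"
```

By the time the loop reaches the ac key, the string no longer contains ac, so D is never produced.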

I thought of using word boundaries (\b), but that did not work well when there were special characters, as in the b[*] example, because those characters are not word characters and so are not covered by the word boundary.
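A small sketch of why \b fails here, using hand-written patterns chosen only for illustration: \b matches only between a word character and a non-word character, and nothing of that kind follows the closing ].

```javascript
// Illustrative patterns only; the variable names are hypothetical.
const withBoundaries = /\bb\[\*\]\b/;
const withoutBoundaries = /b\[\*\]/;

// "]" is a non-word character, so there is no word boundary between "]"
// and the end of the string (or between "]" and a following ".").
console.log(withBoundaries.test("a.b[*]"));    // false
console.log(withBoundaries.test("a.b[*].c"));  // false
console.log(withoutBoundaries.test("a.b[*]")); // true

// For purely alphanumeric keys, \b behaves as hoped.
console.log(/\bac\b/.test("ac"));              // true
```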

I also tried to extend the boundary, but nothing worked as expected.

Bottom line: is there a way to replace one string with another, given that only a full match is valid, but an expression can be part of a JSONPath string?

We can try building a regex alternation of the lookup keys, then use replace() with a callback function to make the replacements. As the comment by @Nick above mentions, the alternation should place longer substrings first.

var map = { "a": "A", "b[*]": "B[*]", "c": "C", "ac": "D" };
var input = "a.c.b[*]";
// Longer keys come first in the alternation so that "ac" wins over "a".
var output = input.replace(/(ac|a|c|b\[\*\])/g, (x, y) => map[y]);
console.log(output); // A.C.B[*]

This approach also avoids an ugly explicit loop.
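If the mapping is only known at runtime, the alternation could be generated rather than written by hand. The following is a sketch under that assumption; buildTranslator and escapeRegExp are hypothetical helper names, not part of the answer above.

```javascript
const map = { "a": "A", "b[*]": "B[*]", "c": "C", "ac": "D" };

// Escape regex metacharacters so keys such as "b[*]" are matched literally.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical helper: builds one alternation from the keys, longest key first,
// and returns a function that translates an expression in a single pass.
function buildTranslator(mapping) {
  const alternation = Object.keys(mapping)
    .sort((x, y) => y.length - x.length) // longer keys first, so "ac" beats "a"
    .map(escapeRegExp)
    .join("|");
  const pattern = new RegExp(alternation, "g");
  // Each matched key is looked up in the mapping by the callback.
  return (expr) => expr.replace(pattern, (match) => mapping[match]);
}

const translate = buildTranslator(map);
console.log(translate("a.c.b[*]")); // "A.C.B[*]"
console.log(translate("ac"));       // "D"
```

Like the literal regex above, this does not check that a key spans a whole "."-separated token; it only guarantees that the longest matching key is preferred at each position.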
