
Expression in regular expression python

I would like to write a regular expression for formatting a text in which a { character is not allowed unless it is preceded by a backslash \. The problem is that a backslash can escape itself, so I don't want to match \\{, for example, but I do want to match \\\{. In other words, I want only an odd number of backslashes before a {. I can't just capture the backslashes in a group and count them afterwards, like this:

import re

s = r"a wei\\\{rd thing\\\\\{"
matches = re.finditer(r"([^\{]|(\\+)\{)+", s)
for match in matches:
    if len(match.group(2)) % 2 == 0:  # even number of backslashes: skip
        continue
    do_some_things()  # placeholder for the real processing

This doesn't work because group 2 can be matched more than once, so I can only access the last capture (in this case, \\\\\). It would be really nice if we could write something like "([^\{]|(\\+)(?if len(\2) / 2 == len(\2) // 2)\{)+" as the regular expression, but as far as I know that is impossible. How can I do this?

This matches an odd number of backslashes followed by a brace:

(?<!\\)(\\\\)*(\\\{)

Breakdown:

  • (?<!\\) - Not preceded by a backslash, so the match starts at the first backslash of the run
    • This is called a "negative lookbehind"
  • (\\\\)* - Zero or more pairs of backslashes
  • (\\\{) - A backslash then a brace
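The breakdown above can be tried on the sample string from the question. This is a minimal sketch, not part of the original answer; the names ESCAPED_BRACE and find_escaped_braces are made up for illustration.

```python
import re

# Pattern from the answer: no backslash before, then pairs of backslashes, then \{
ESCAPED_BRACE = re.compile(r"(?<!\\)(\\\\)*(\\\{)")

def find_escaped_braces(text):
    """Return (start, matched_text) for each brace preceded by an odd number of backslashes."""
    return [(m.start(), m.group(0)) for m in ESCAPED_BRACE.finditer(text)]

s = r"a wei\\\{rd thing\\\\\{"
for start, found in find_escaped_braces(s):
    print(start, found)
```

Using finditer on the whole pattern avoids the repeated-group problem from the question, because each run of backslashes produces its own match object.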

Matches:

\{
\\\{
\\\\\{

Non-matches:

\\{
\\\\{
\\\\\\{
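The two lists can be checked mechanically. The sketch below builds a run of n backslashes followed by a brace and records whether the pattern finds a match; the helper name matches_for_counts is hypothetical.

```python
import re

ESCAPED_BRACE = re.compile(r"(?<!\\)(\\\\)*(\\\{)")

def matches_for_counts(max_backslashes):
    """Map each backslash count n to whether n backslashes plus a brace is matched."""
    results = {}
    for n in range(max_backslashes + 1):
        candidate = "\\" * n + "{"  # n literal backslashes, then a brace
        results[n] = ESCAPED_BRACE.search(candidate) is not None
    return results

print(matches_for_counts(6))
```

Only the odd counts should come back as True, which agrees with the lists above.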

Try it on RegExr.


This was partly inspired by Vadim Baratashvili's answer.

I think you can use this as a solution: ([^\\](\\\\){0,})(\{)

The pattern checks that between the last character that is not a backslash and the { there are zero or more pairs of backslashes. If part of the text matches, we can replace it with the first group, $1 (a non-backslash character plus zero or more pairs of backslashes), which finds and removes each unescaped {.
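In Python the replacement could look roughly like the sketch below, where \g<1> plays the role of $1. The function name strip_unescaped_braces is an assumption for illustration. Note that the pattern needs a non-backslash character before the brace, so a brace at the very start of the text is left alone.

```python
import re

# Pattern from the answer: a non-backslash character, pairs of backslashes, then a brace
UNESCAPED_BRACE = re.compile(r"([^\\](\\\\){0,})(\{)")

def strip_unescaped_braces(text):
    """Drop each { that follows an even number of backslashes, keeping group 1."""
    return UNESCAPED_BRACE.sub(r"\g<1>", text)

print(strip_unescaped_braces(r"a{b"))    # unescaped brace is removed
print(strip_unescaped_braces(r"a\{b"))   # escaped brace is kept
print(strip_unescaped_braces(r"a\\{b"))  # escaped backslash, so the brace is removed
```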

If we want to find an escaped { instead, we can use this expression: ([^\\](\\\\){0,})(\\\{). The \{ itself is captured by the last group (group 3, since the inner (\\\\) is group 2).
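A short sketch of how this second expression behaves in Python, with a hypothetical helper find_escaped_alt that returns groups 1 and 3 for each match. As with the previous pattern, the leading [^\\] consumes one character, so an escaped brace at the very start of the text is not found.

```python
import re

# Second pattern from the answer: non-backslash character, pairs of backslashes, then \{
ESCAPED_BRACE_ALT = re.compile(r"([^\\](\\\\){0,})(\\\{)")

def find_escaped_alt(text):
    """Return (prefix, escaped_brace) pairs taken from groups 1 and 3 of each match."""
    return [(m.group(1), m.group(3)) for m in ESCAPED_BRACE_ALT.finditer(text)]

print(find_escaped_alt(r"ab\\\{cd"))  # three backslashes: found
print(find_escaped_alt(r"ab\\{cd"))   # two backslashes: not found
print(find_escaped_alt(r"\{x"))       # brace at the start: not found
```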
