How to use a regular expression in Python to ignore strings that start with a specific pattern?

Question

Accept and return @something but reject first@last.

r'@([A-Z][A-Z0-9_]*[A-Z0-9])'

The regexp above accepts @something (starts with a letter, ends with a letter or digit, may contain underscores in the middle, at least 2 characters long) and returns the part after the @ symbol.

I do not want to return matches where a letter or digit (A-Z, 0-9) comes immediately before the @ symbol.

Spaces, newlines, special characters, etc. before the @ are allowed.

Code:

re.findall(r'@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I)
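
To make the problem concrete, here is a minimal sketch that runs the original pattern on the sample sentence from the question. The variable names are only for illustration.

```python
import re

# Sample sentence taken from the question.
text = "Accept and return @something but reject first@last."

# Original pattern: nothing constrains what comes before the '@',
# so the '@last' inside 'first@last' is matched as well.
original = re.findall(r'@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I)
print(original)  # ['something', 'last']
```

The unwanted 'last' is what the answers set out to exclude.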

Answer 1

Use

re.findall(r'(?<![A-Z0-9])@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I)

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    [A-Z0-9]                 any character of: 'A' to 'Z', '0' to '9'
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  @                        '@'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [A-Z]                    any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
    [A-Z0-9_]*               any character of: 'A' to 'Z', '0' to
                             '9', '_' (0 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    [A-Z0-9]                 any character of: 'A' to 'Z', '0' to '9'
--------------------------------------------------------------------------------
  )                        end of \1
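
As a rough check of the explanation above, the following sketch applies the negative lookbehind to a few made-up inputs. The sample text is an assumption for illustration, not part of the original answer.

```python
import re

# Pattern from Answer 1: '@' must not be preceded by a letter or digit.
# re.I also applies inside the lookbehind, so lowercase letters count.
pattern = re.compile(r'(?<![A-Z0-9])@([A-Z][A-Z0-9_]*[A-Z0-9])', re.I)

# Hypothetical sample covering start of string, a letter before '@',
# punctuation before '@', a digit before '@', a newline before '@',
# and a one-character name (too short for the pattern).
text = "@alpha first@last (@beta_2) 9@gamma\n@delta @x"

matches = pattern.findall(text)
print(matches)  # ['alpha', 'beta_2', 'delta']
```

Here '@last' and '@gamma' are skipped because a letter or digit sits directly before the '@', and '@x' is skipped because the name needs at least two characters.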

Answer 2

You can use

\B@([A-Z][A-Z0-9_]*[A-Z0-9])

The pattern matches:

\B Assert a position where a word boundary does not match
@ Match @ literally
( Capture group 1
- [A-Z][A-Z0-9_]*[A-Z0-9]
) Close group 1

Regex demo

import re

text = "Accept and return @something but reject first@last."
print(re.findall(r'\B@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I))

Output

['something']
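
The two answers are not fully equivalent. The sketch below compares them on a made-up input (an assumption for illustration): \B treats an underscore before the @ as a word character and rejects the match, while the lookbehind on [A-Z0-9] allows it.

```python
import re

# Answer 1: reject only when a letter or digit precedes '@'.
lookbehind = re.compile(r'(?<![A-Z0-9])@([A-Z][A-Z0-9_]*[A-Z0-9])', re.I)
# Answer 2: reject when any word character (\w, which includes '_')
# precedes '@', because that position is a word boundary.
non_boundary = re.compile(r'\B@([A-Z][A-Z0-9_]*[A-Z0-9])', re.I)

# Hypothetical sample: the last token has an underscore before '@'.
text = "@ok first@last _@under"

by_lookbehind = lookbehind.findall(text)
by_non_boundary = non_boundary.findall(text)
print(by_lookbehind)    # ['ok', 'under']
print(by_non_boundary)  # ['ok']
```

Which behavior is preferable depends on whether an underscore before the @ should block a match. The question only mentions A-Z0-9, so the lookbehind follows its wording more literally. Note also that for str patterns in Python 3, \w is Unicode-aware, so \B additionally rejects an @ preceded by non-ASCII letters.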

2 answers

Answer 1
1 accepted 2021-06-10 22:12:08

Answer 2
1 2021-06-10 22:50:45
