
How to ignore strings that start with a certain pattern using a regular expression in Python?

Accept and return @something but reject first@last.

r'@([A-Z][A-Z0-9_]*[A-Z0-9])'

The regex above accepts @something (starts with a letter, ends with a letter or digit, may contain underscores in the middle, at least 2 characters long) and returns the part after the @ symbol.

I do not want to return matches where a letter or digit (A-Z, 0-9) comes immediately before the @ symbol.

Spaces, newlines, special characters, etc. before the @ are allowed.

Code:

re.findall(r'@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I)
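
To make the problem concrete, here is a small sketch that runs the pattern above on the sample sentence from the question. The variable names are illustrative only.

```python
import re

# Sample sentence from the question.
text = "Accept and return @something but reject first@last."

# Original pattern: nothing restricts what may come before the '@',
# so the 'last' in 'first@last' is captured as well.
original = re.findall(r'@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I)
print(original)  # ['something', 'last']
```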

Use

re.findall(r'(?<![A-Z0-9])@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I)

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    [A-Z0-9]                 any character of: 'A' to 'Z', '0' to '9'
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  @                        '@'
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [A-Z]                    any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
    [A-Z0-9_]*               any character of: 'A' to 'Z', '0' to
                             '9', '_' (0 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    [A-Z0-9]                 any character of: 'A' to 'Z', '0' to '9'
--------------------------------------------------------------------------------
  )                        end of \1
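
As a quick check of the breakdown above, the following sketch applies the lookbehind pattern to a few made-up inputs. The sample string and the names `PATTERN`, `sample`, and `handles` are assumptions chosen for illustration, not part of the original answer.

```python
import re

# Negative lookbehind: the '@' must not be preceded by a letter or digit.
PATTERN = re.compile(r'(?<![A-Z0-9])@([A-Z][A-Z0-9_]*[A-Z0-9])', re.I)

# Hypothetical sample covering: start of string, a letter before '@',
# punctuation before '@', a digit before '@', a one-character name,
# and a newline before '@'.
sample = "@alpha first@last (@beta_1) 9@gamma @x\n@delta"

handles = PATTERN.findall(sample)
print(handles)  # ['alpha', 'beta_1', 'delta']
```

`@x` is skipped because the pattern requires at least two characters after the `@`.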

You can use

\B@([A-Z][A-Z0-9_]*[A-Z0-9])

The pattern matches:

  • \B Assert a position where a word boundary does not match
  • @ Match @ literally
  • ( Capture group 1
    • [A-Z][A-Z0-9_]*[A-Z0-9] Match a letter, then optional letters, digits, or underscores, then a letter or digit
  • ) Close group 1

Regex demo

import re

text = "Accept and return @something but reject first@last."
print(re.findall(r'\B@([A-Z][A-Z0-9_]*[A-Z0-9])', text, re.I))

Output

['something']
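
The `\B` pattern and the `(?<![A-Z0-9])` lookbehind agree on the sample sentence, but they are not interchangeable. The sketch below, using made-up input and illustrative names, shows two cases where they differ: `\B` treats an underscore and (for `str` patterns in Python 3) non-ASCII letters as word characters, while the lookbehind only excludes ASCII letters and digits.

```python
import re

BODY = r'@([A-Z][A-Z0-9_]*[A-Z0-9])'
lookbehind = re.compile(r'(?<![A-Z0-9])' + BODY, re.I)
non_boundary = re.compile(r'\B' + BODY, re.I)

# Hypothetical sample: '_' before '@', an accented letter before '@',
# and plain punctuation before '@'.
sample = "snake_@one caf\u00e9@two (@three)"

by_lookbehind = lookbehind.findall(sample)
by_non_boundary = non_boundary.findall(sample)

print(by_lookbehind)    # ['one', 'two', 'three']
print(by_non_boundary)  # ['three']
```

Which behavior is preferable depends on whether an underscore or an accented letter before the `@` should count as part of a preceding word. Because `@` is never a word character, `(?<!\w)` in place of `\B` should behave the same way here.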
