How to find all words followed by a symbol using Python regex?

I need re.findall to detect words that are followed by a "=".

It works for an example like this:

re.findall(r'\w+(?=[=])', "I think Python=amazing")

But it won't work for "I think Python = amazing" or "Python =amazing". I don't know how to handle the optional whitespace properly here.

Thanks a bunch!

You said "Again stuck in the regex", probably in reference to your earlier question, "Looking for a way to identify and replace Python variables in a script". You got answers to the question you asked there, but I don't think you asked the question you really wanted answered.

You are looking to refactor Python code, and unless your tool understands Python, it will generate false positives and false negatives; that is, it will find instances of variable = that aren't assignments, and it will miss assignments that your regexp doesn't match.

There is a partial list of tools at "What refactoring tools do you use for Python?", and more general searches for "refactoring Python your_editing_environment" will yield more still.
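To illustrate what "a tool that understands Python" could look like, here is a minimal sketch using the standard-library ast module, compared against a regex. The function name assigned_names and the sample source are assumptions made up for this illustration, and only plain name = value statements are handled (no augmented, annotated, or tuple assignments).

```python
import ast
import re

def assigned_names(source):
    """Collect names that are targets of plain `name = value` assignments."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.append(target.id)
    return names

# Hypothetical sample: one comparison, one "=" inside a string,
# and one keyword argument, none of which are assignments.
code = '''
x = 1
if x == 1:
    print("y = 2", end="")
total = x + 1
'''

ast_hits = assigned_names(code)
regex_hits = re.findall(r'(\w+)\s*=', code)

print(ast_hits)    # ['x', 'total']
print(regex_hits)  # ['x', 'x', 'y', 'end', 'total']
```

The regex reports the comparison, the string contents, and the keyword argument as if they were assignments, which is the kind of false positive described above.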

r'(\w+)\s*=\s*'

import re
re.findall(r'(\w+)\s*=\s*', 'I think Python=amazing')    # returns ['Python']
re.findall(r'(\w+)\s*=\s*', 'I think Python = amazing')  # returns ['Python']
re.findall(r'(\w+)\s*=\s*', 'I think Python =amazing')   # returns ['Python']

Just add some optional whitespace before the =:

\w+(?=\s*=)
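As a quick check of the lookahead with optional whitespace, the sketch below runs it against the three strings from the question. The variable names are chosen only for this illustration.

```python
import re

# Lookahead version: match a word only if optional whitespace and "=" follow.
pattern = re.compile(r'\w+(?=\s*=)')

samples = [
    "I think Python=amazing",
    "I think Python = amazing",
    "I think Python =amazing",
]

results = [pattern.findall(s) for s in samples]
for sample, found in zip(samples, results):
    print(repr(sample), '->', found)  # each line ends with ['Python']
```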

Use this instead:

re.findall(r'^(.+)(?=[=])', "I think Python=amazing")

Explanation

# ^(.+)(?=[=])
# 
# Options: case insensitive
# 
# Assert position at the beginning of the string «^»
# Match the regular expression below and capture its match into backreference number 1 «(.+)»
#    Match any single character that is not a line break character «.+»
#       Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=[=])»
#    Match the character “=” «[=]»
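
To see what the pattern explained above returns in practice, here is a small sketch. The second input string is an assumption added for illustration. Because .+ is greedy and the match is anchored at the start of the string, the result is everything before the last =, not only the word next to it.

```python
import re

pattern = re.compile(r'^(.+)(?=[=])')

single = pattern.findall("I think Python=amazing")
double = pattern.findall("a=b=c")  # hypothetical input with two "=" signs

print(single)  # ['I think Python']
print(double)  # ['a=b']
```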

You need to allow for whitespace between the word and the =:

re.findall(r'\w+(?=\s*[=])', "I think Python = amazing")

You can also simplify the expression by using a capturing group around the word, instead of a lookahead around the equals sign:

re.findall(r'(\w+)\s*=', "I think Python = amazing")
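
Both forms give the same list from re.findall, but they differ in what the overall match covers, which matters if the pattern is later used for replacement. The following is a sketch of that difference; the replacement word 'Ruby' and the variable names are assumptions for illustration only.

```python
import re

text = "I think Python = amazing"

lookahead = re.compile(r'\w+(?=\s*=)')  # "=" is checked but not consumed
capturing = re.compile(r'(\w+)\s*=')    # "=" is part of the match

same_words = lookahead.findall(text) == capturing.findall(text)

# The full match (group 0) differs between the two patterns.
lookahead_match = lookahead.search(text).group(0)
capturing_match = capturing.search(text).group(0)
print(lookahead_match)  # Python
print(capturing_match)  # Python =

# Substitution therefore behaves differently as well.
lookahead_sub = lookahead.sub('Ruby', text)
capturing_sub = capturing.sub('Ruby', text)
print(lookahead_sub)  # I think Ruby = amazing
print(capturing_sub)  # I think Ruby amazing
```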

r'(.*)=.*' would do it as well ...

You have anything #1, followed by a =, followed by anything #2, and you get anything #1 back.

>>> re.findall(r'(.*)=.*', "I think Python=amazing")
['I think Python']
>>> re.findall(r'(.*)=.*', "  I think Python =    amazing oh yes very amazing   ")
['  I think Python ']
>>> re.findall(r'(.*)=.*', "=  crazy  ")
['']

Then you can strip() the string in the returned list.
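
A minimal sketch of that clean-up step, assuming you then want only the last word before the =. Taking the last word with split()[-1] is an addition for illustration, not part of the pattern itself.

```python
import re

raw = re.findall(r'(.*)=.*', "  I think Python =    amazing oh yes very amazing   ")

# Remove the surrounding whitespace from each captured string.
cleaned = [m.strip() for m in raw]

# Keep only the last word of each non-empty capture.
last_words = [m.split()[-1] for m in cleaned if m]

print(cleaned)     # ['I think Python']
print(last_words)  # ['Python']
```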

re.split(r'\s*=', "I think Python=amazing")[0].split() # returns ['I', 'think', 'Python']
