What is wrong with this simple regex?

I am stuck on this:

import re
print(re.search(r'return[^$]+',
                'return to the Treasury of $40 million\nnow!').group(0))

The above regex only prints return to the Treasury of , but I expected it to include $40 million . My understanding of the regex is that I am asking it to take everything until the end of the line.

I do not want to use .* ; I want the end-of-line delimiter to make the match run from some point to the end of the line. If I remove $ from the search string, it prints the full string. Why is the end-of-line delimiter matching the dollar sign?

return[^$]+

will match the string "return" followed by one or more characters that are not '$'.

This is because [ ] denotes a character class, and inside [ ] special characters such as $ are treated as literal characters.

Thus it matches only up to the dollar sign.
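To illustrate the point above, here is a minimal sketch using the text from the question. It only shows that $ behaves as a literal character inside a character class; the variable names are chosen for illustration.

```python
import re

text = 'return to the Treasury of $40 million\nnow!'

# Inside [...] the dollar sign is an ordinary character, so [^$] means
# "any character except a literal '$'". The match stops right before '$40'.
m = re.search(r'return[^$]+', text)
print(repr(m.group(0)))   # 'return to the Treasury of '

# The positive class [$] matches the literal dollar sign itself.
dollars = re.findall(r'[$]', text)
print(dollars)            # ['$']
```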

Why not use:

return.+$

With the re.MULTILINE flag set, so that $ can match just before the newline, this does what you want.

Why don't you want to use .* ?

The regex you have will match any string that starts with "return", followed by one or more characters that are not the "$" character. Note that this will NOT look for the end-of-line marker.
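A short sketch of that behaviour: when the input contains no dollar sign, the negated class runs straight across the newline. The first input string is a hypothetical example, and `[^\n]+` is shown only as one possible way to stop at the line end without using .* , assuming that is what the question intended.

```python
import re

# Hypothetical input with no dollar sign in it.
no_dollar = 'return to sender\nnow!'

# [^$] excludes only the literal '$', so it also consumes the newline.
crosses_newline = re.search(r'return[^$]+', no_dollar).group(0)
print(repr(crosses_newline))    # 'return to sender\nnow!'

# Excluding the newline character instead stops the match at the line end.
text = 'return to the Treasury of $40 million\nnow!'
stops_at_newline = re.search(r'return[^\n]+', text).group(0)
print(repr(stops_at_newline))   # 'return to the Treasury of $40 million'
```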

return.*$ will match everything up to the end of the line. The $ anchor itself is zero-width, so the newline is not part of the match. You may (but probably will not) need to make the .* lazy if you are dealing with multi-line input.

import re
text = 'we will return to the Treasury of $40 million\nunits of money.'
print(re.search(r'return.*$', text, re.MULTILINE).group(0))

# prints: return to the Treasury of $40 million

You need to include the multiline flag (re.MULTILINE); then $ will also match just before each newline, not only at the end of the string.
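The effect of the flag can be checked with a small comparison on the same text. This is a sketch, not the only way to write it: without re.MULTILINE, $ matches only at the end of the string (or just before a trailing newline), so the search fails here.

```python
import re

text = 'we will return to the Treasury of $40 million\nunits of money.'

# Without the flag, '.' cannot cross the newline and '$' cannot match
# in the middle of the string, so there is no match at all.
without_flag = re.search(r'return.*$', text)
print(without_flag)            # None

# With re.MULTILINE, '$' also matches just before the newline.
with_flag = re.search(r'return.*$', text, re.MULTILINE)
print(with_flag.group(0))      # return to the Treasury of $40 million
```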

