How can I find a word ending with "n't" before a number using Regex?
I want to find all "n't" in a sentence with this regex:
n't [0-9]+(\.[0-9][0-9]?)?
It works fine in RegExr, but when I try it with this code, it does not work:
import re

txt = "japan isn't 56 country in Europe."
nt = re.findall(r"n't [0-9]+(\.[0-9][0-9]?)?", txt)
print(nt)
findall is slightly weird when it comes to parentheses. Once you have a capturing group in the pattern, it only returns the result of that group, not of the entire match. You can make the parentheses non-capturing:
>>> nt = re.findall(r"n't [0-9]+(?:\.[0-9][0-9]?)?",txt)
>>> print(nt)
["n't 56"]
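As a minimal sketch of that behavior, the snippet below runs the question's input through three variants of the pattern. The variable names are only for illustration, and the first pattern (without the decimal part) is a simplification that does not appear in the question.

```python
import re

txt = "japan isn't 56 country in Europe."

# No groups: findall returns the whole match.
no_group = re.findall(r"n't [0-9]+", txt)

# One capturing group: findall returns only that group's text.
# The optional group does not participate here, so the item is ''.
one_group = re.findall(r"n't [0-9]+(\.[0-9][0-9]?)?", txt)

# Non-capturing group: findall returns the whole match again.
non_capturing = re.findall(r"n't [0-9]+(?:\.[0-9][0-9]?)?", txt)

print(no_group)       # ["n't 56"]
print(one_group)      # ['']
print(non_capturing)  # ["n't 56"]
```

Note that the capturing version does find a match; it just reports the empty group instead of the matched text.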
This is a subtle problem with your script, which the following fixes:
import re

txt = "japan isn't 56 country in Europe."
nt = re.findall(r"n't [0-9]+(?:\.[0-9][0-9]?)?", txt)
print(nt)  # prints ["n't 56"]
In your original call to re.findall, you were using this pattern:
n't [0-9]+(\.[0-9][0-9]?)?
This means that the first capture group is the optional decimal term, for example .12 (a dot followed by one or two digits). With the re.findall API, if you specify a capture group, then that group is what will be returned instead of the whole match. Given that your input did not contain this decimal part, the group matched nothing, so your result was [''], a list holding a single empty string. In my corrected version, I made the group non-capturing, using ?:. If you don't specify any explicit capture groups, then the entire match is returned, which is the behavior you want here.
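If the capture group is still needed for something else, one alternative (not part of the original answers) is re.finditer, which yields match objects that expose both the full match and each group. The sketch below assumes a made-up input sentence containing a decimal; the variable names are illustrative only.

```python
import re

# Hypothetical input with a decimal, so the optional group matches once.
txt = "it isn't 56 and it wasn't 3.14 either"
pattern = re.compile(r"n't [0-9]+(\.[0-9][0-9]?)?")

# group(0) is always the entire match, regardless of capture groups.
full_matches = [m.group(0) for m in pattern.finditer(txt)]

# group(1) is the optional decimal part, or None when it did not match.
fractions = [m.group(1) for m in pattern.finditer(txt)]

print(full_matches)  # ["n't 56", "n't 3.14"]
print(fractions)     # [None, '.14']
```

Unlike findall, a match object reports a group that did not participate as None rather than as an empty string.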