
Python regex: match a possible word

I want to write a regex that matches a word that might not be present. I read here that I should try something like this:

import re

line = "a little boy went to the small garden and ate an apple"


res = re.findall("a (little|big) (boy|girl) went to the (?=.*\bsmall\b) garden and ate a(n?)",line)

print(res)

but the output of this is

[]

which is also the output if I set line to be

a little boy went to the garden and ate an apple

How do I allow a word to be optionally present in my text, and capture it if it is there?

First, you need to match not only the word "small" but also the space after it (or before it). So you could use a regex like this: (small )? . On the other hand, you want to capture only the word. To keep the space out of the captured text, wrap the word and the space in a non-capturing group and capture only the word inside it: (?:(small) )?
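To make the difference between the two patterns concrete, here is a minimal sketch that applies both to a shortened version of the sentence. The variable names and the shortened test strings are chosen for illustration only.

```python
import re

with_small = "went to the small garden"
without_small = "went to the garden"

# Optional capturing group that includes the trailing space.
word_and_space = re.compile(r"went to the (small )?garden")
# Optional non-capturing group; only the word inside it is captured.
word_only = re.compile(r"went to the (?:(small) )?garden")

for text in (with_small, without_small):
    a = word_and_space.search(text).group(1)
    b = word_only.search(text).group(1)
    print(repr(a), repr(b))
# First line of output:  'small ' 'small'
# Second line of output: None None
```

Both patterns match either sentence; they differ only in whether the captured text carries the trailing space.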

Full example:

import re

lines = [
    'a little boy went to the small garden and ate an apple',
    'a little boy went to the garden and ate an apple'
]

for line in lines:
    res = re.findall(r'a (little|big) (boy|girl) went to the (?:(small) )?garden and ate a(n?)', line)
    print(res)

Output:

[('little', 'boy', 'small', 'n')]
[('little', 'boy', '', 'n')]
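As a side note on why the lookahead attempt from the question finds nothing, the sketch below compares it with the optional-group pattern on a shortened form of the regex. A lookahead is zero-width: it checks that "small" appears later but consumes no text, so the pattern still demands a space and "garden" immediately after "the ". In addition, in a non-raw string literal "\b" is a backspace character rather than a word boundary, which is why raw strings are used here. The variable names are illustrative.

```python
import re

line = "a little boy went to the small garden and ate an apple"

# The lookahead succeeds but consumes nothing, so " garden" must then
# match at the position where "small" starts, which it cannot.
lookahead = re.compile(r"went to the (?=.*\bsmall\b) garden")

# The optional group actually consumes "small " when it is present.
optional = re.compile(r"went to the (?:(small) )?garden")

print(lookahead.search(line))           # None
print(optional.search(line).group(1))   # small

# In a non-raw string, "\b" is the backspace character (0x08).
print("\b" == "\x08")                   # True
```

Note that re.findall reports a group that did not take part in the match as an empty string, as in the second output line above, while group(1) on a match object from re.search returns None in that case.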
