
Python searching for two words regex

I'm trying to check whether a sentence contains the phrase "go * to", for example "go over to", "go up to", etc. I'm using TextBlob, and I know I can use the code below:

from textblob import TextBlob

search_go_to = set(["go", "to"])
go_to_blob = TextBlob(var)  # var holds the text to search
matches = [str(s) for s in go_to_blob.sentences if search_go_to & set(s.words)]
print(matches)

but that would also return sentences like "go over there and bring this to him", which I don't want. Anyone know how I can do something like text.find("go * to")?

Try using:

for match in re.finditer(r"go\s+\w+\s+to", text, re.IGNORECASE):
    print(match.group())  # e.g. "go over to"
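
A slightly fuller sketch of the same idea, assuming the text is a plain string. The helper name find_go_to and the \b word boundaries are additions for illustration, not part of the answer above; the boundaries keep "go" and "to" from matching inside longer words such as "ago" or "tomorrow".

```python
import re

# "go", exactly one word, then "to"; \b anchors both ends on word boundaries.
GO_X_TO = re.compile(r"\bgo\s+\w+\s+to\b", re.IGNORECASE)

def find_go_to(text):
    """Return every 'go <one word> to' phrase found in text."""
    return [m.group() for m in GO_X_TO.finditer(text)]

print(find_go_to("Please go over to the desk, then Go up to the roof."))
# ['go over to', 'Go up to']
print(find_go_to("go over there and bring this to him"))
# []
```

Because \w+ matches only one word, the sentence from the question with several words between "go" and "to" is not reported.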

Use generator expressions

>>> import re
>>> search_go_to = ["go", "to"]  # a list keeps the words in order; a set does not
>>> m = ' .*? '.join(x for x in search_go_to)
>>> words = ["go over to", "go up to", "foo bar"]
>>> matches = [s for s in words if re.search(m, s)]
>>> print(matches)
['go over to', 'go up to']
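
One caveat worth checking, sketched here with the sentence from the question: the pattern this join is meant to produce, "go .*? to", lets .*? span any number of words, so it still matches the sentence the OP wants to exclude. The second pattern, using \w+ for exactly one word, is shown only for comparison and is not part of the answer above.

```python
import re

any_gap = re.compile(r"go .*? to")    # the pattern the join is meant to build
one_word = re.compile(r"go \w+ to")   # exactly one word in between

unwanted = "go over there and bring this to him"
wanted = "go over to"

print(bool(any_gap.search(unwanted)), bool(one_word.search(unwanted)))
# True False
print(bool(any_gap.search(wanted)), bool(one_word.search(wanted)))
# True True
```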

Try this

import re

text = "something go over to something"

if re.search(r"go\s+?\S+?\s+?to", text):
    print("found")
else:
    print("not found")

Regex:

\s matches any whitespace character
\S matches any non-whitespace character, including special characters
+? is the non-greedy form of + (not required for the OP's question)

So re.search(r"go\s+?\S+?\s+?to", text) would match "something go W#$%^^$ to something" and, of course, "something go over to something" as well.
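
To make the difference between \S+ here and the \w+ used in the earlier answer concrete, here is a small comparison. The sample strings are taken from this thread; the variable names are made up for the sketch.

```python
import re

word_only = re.compile(r"go\s+\w+\s+to")  # middle token: letters, digits, underscore
any_token = re.compile(r"go\s+\S+\s+to")  # middle token: any run of non-whitespace

samples = [
    "something go over to something",
    "something go W#$%^^$ to something",
    "go over there and bring this to him",
]
for s in samples:
    print(bool(word_only.search(s)), bool(any_token.search(s)), s)
# True True something go over to something
# False True something go W#$%^^$ to something
# False False go over there and bring this to him
```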

Does this work?

import re
from textblob import TextBlob

search_go_to = re.compile(r"^go.*to$")
go_to_blob = TextBlob(var)  # var holds the text to search
matches = [str(s) for s in go_to_blob.sentences if search_go_to.match(str(s))]
print(matches)

Explanation for the regex:

^    beginning of line/string
go   literal matching of "go"
.*   zero or more characters of any kind
to   literal matching of "to"
$    end of line/string

If you don't want "going to" to match, insert a \b (word boundary) after go and before to.
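
A quick check of the word-boundary suggestion, using made-up sample strings and plain strings instead of TextBlob sentences. It also shows a consequence of the ^ and $ anchors: the whole string has to start with "go" and end with "to", so a phrase in the middle of a sentence is not found.

```python
import re

loose = re.compile(r"^go.*to$")          # the pattern from the answer
strict = re.compile(r"^go\b.*\bto$")     # with word boundaries added

for s in ["go over to", "going to", "go onto", "please go over to him"]:
    print(bool(loose.match(s)), bool(strict.match(s)), s)
# True True go over to
# True False going to
# True False go onto
# False False please go over to him
```

To find the phrase anywhere in a sentence, one option is to drop the anchors and call search instead of match.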
