I want a regex that can tell whether two given words occur in the same sentence (word order matters). The problem is that a sentence can contain a contraction, so a period doesn't necessarily mark the end of the sentence. The part of the regex that detects the end of a sentence is
\.(\s+[A-Z]|\s*$)
What would the pattern look like?
You could use this:
(\b\w+\b)(?:[^.]|\.\s)*(\b\w+\b)
This basically says: match and capture a word, then anything that is not a period, or a period followed by a whitespace character, any number of times, and finally match and capture another word.
EDIT: For given words in either order, use:
(\bWord1\b)(?:[^.]|\.\s)*(\bWord2\b)|(\bWord2\b)(?:[^.]|\.\s)*(\bWord1\b)
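One caveat with the patterns above: because `\.\s` is allowed inside the gap, a match can also run across a period followed by a space and a capital letter, which is exactly what the question treats as a sentence end. A minimal sketch of one way around that, in Python rather than C#, is to reuse the question's end-of-sentence expression in a negative lookahead. The names `SENTENCE_END`, `same_sentence_pattern` and `in_same_sentence` are made up for this illustration, and an abbreviation followed by a capitalised word (such as "Mr. Smith") would still count as a sentence end under that definition.

```python
import re

# Sentence end as defined in the question: a period followed by
# whitespace and a capital letter, or by the end of the text.
SENTENCE_END = r"\.(?:\s+[A-Z]|\s*$)"


def same_sentence_pattern(word1, word2):
    """Match word1 followed by word2 with no sentence end in between."""
    w1, w2 = re.escape(word1), re.escape(word2)
    # Advance one character at a time, refusing to step over a sentence end.
    gap = rf"(?:(?!{SENTENCE_END}).)*?"
    return re.compile(rf"\b{w1}\b{gap}\b{w2}\b", re.DOTALL)


def in_same_sentence(text, word1, word2):
    """Return True if word1 precedes word2 inside one sentence of text."""
    return same_sentence_pattern(word1, word2).search(text) is not None
```

With the text "The cat sat at approx. noon today. Then the dog barked.", `in_same_sentence(text, "cat", "noon")` is true because the period after "approx" is followed by a lowercase word, while `in_same_sentence(text, "cat", "dog")` is false because the period after "today" ends the sentence.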
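Not C#, but you should get the idea:
for sentence in split_text_with_regex(text):
    index_word1 = sentence.find(word1)
    index_word2 = sentence.find(word2)
    # find() returns -1 when the word is missing; word1 must be found first
    if index_word1 != -1 and index_word1 < index_word2:
        pass  # do your thing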
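To make the split-then-search idea concrete, here is a rough, self-contained Python sketch. `split_text_with_regex` is not a library function; the version below is a hypothetical helper that splits on whitespace preceded by a period and followed by a capital letter, following the question's end-of-sentence rule. `sentences_with_words_in_order` is likewise an invented name. Whole-word matching with `\b` is used instead of `find`, on the assumption that "cat" should not match inside "concatenate".

```python
import re


def split_text_with_regex(text):
    """Hypothetical helper: split after a period that is followed by
    whitespace and a capital letter (the question's sentence-end rule)."""
    return re.split(r"(?<=\.)\s+(?=[A-Z])", text.strip())


def sentences_with_words_in_order(text, word1, word2):
    """Return the sentences in which word1 occurs before word2."""
    first_re = re.compile(rf"\b{re.escape(word1)}\b")
    second_re = re.compile(rf"\b{re.escape(word2)}\b")
    hits = []
    for sentence in split_text_with_regex(text):
        first = first_re.search(sentence)
        if first is None:
            continue
        # Only look for word2 after the end of the first word1 occurrence.
        if second_re.search(sentence, first.end()):
            hits.append(sentence)
    return hits
```

Splitting first keeps each regex small, at the cost of building a list of sentences; for long texts a single-pass pattern may be preferable.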
You can also construct the regular expression in Visual Studio itself. See the first paragraph of http://msdn.microsoft.com/en-us/library/2k3te2cs(VS.80).aspx
So I think it's something like this (untested):
(([\w\s]*\s)?Word1\s([\w\s]*)?\sWord2(\s[\w\s]*)?\.)(?=(\s+[A-Z]|\s*$))
Edit: Thinking about it, that won't match punctuation (commas, apostrophes). Perhaps each [\w\s] should be [^.] or a list of allowed characters.
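A quick way to see the trade-off of the `[^.]` variant is to try it out. The Python sketch below is an adaptation, not the exact pattern above: the character classes are swapped for `[^.]`, the groups are made non-capturing, and the middle part is rearranged so that two adjacent words separated by a single space also match. `build_pattern` is an illustrative name only.

```python
import re


def build_pattern(word1, word2):
    """word1 then word2 inside one period-free sentence, followed by the
    question's sentence end (adapted from the suggestion above)."""
    w1, w2 = re.escape(word1), re.escape(word2)
    return re.compile(
        rf"(?:[^.]*\s)?\b{w1}\s(?:[^.]*\s)?{w2}\b[^.]*\."  # no inner periods
        rf"(?=\s+[A-Z]|\s*$)"                              # sentence end
    )


# Commas and apostrophes are fine now.
ok = build_pattern("owner", "dog").search("Well, the cat's owner saw a dog. Then it rained.")

# A period inside the sentence (here an abbreviation) still blocks the match.
blocked = build_pattern("cat", "dog").search("The cat sat at approx. noon near a dog.")
```

Here `ok` is a match object and `blocked` is `None`, so this variant handles ordinary punctuation but not sentences that contain a period of their own.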