I've managed to find the words beginning with capital letters, but I can't figure out a regex that filters out the ones at the start of a sentence.
Each sentence ends with a full stop and a space.
Test_string = This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence.
Desired output = ['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']
I'm coding in Python. Will be glad if someone can help me out with the regex :)
You may use the following expression:
(?<!^)(?<!\. )[A-Z][a-z]+
import re
mystr = "This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence."
print(re.findall(r'(?<!^)(?<!\. )[A-Z][a-z]+', mystr))
Prints:
['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']
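To make the two lookbehinds easier to read, here is the same expression written with re.VERBOSE so each part carries a comment. This is only a sketch: the function name find_mid_sentence_capitals is made up for illustration, and the space is written as [ ] because verbose mode ignores bare whitespace in the pattern.

```python
import re

# Same expression as above, spelled out with re.VERBOSE.
# find_mid_sentence_capitals is a hypothetical helper name, not part of any library.
PATTERN = re.compile(r"""
    (?<!^)       # not at the very start of the string
    (?<!\.[ ])   # not directly after a full stop and a space
    [A-Z]        # one uppercase ASCII letter
    [a-z]+       # followed by one or more lowercase ASCII letters
""", re.VERBOSE)

def find_mid_sentence_capitals(text):
    """Return capitalized words that do not start a sentence."""
    return PATTERN.findall(text)

mystr = "This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence."
print(find_mid_sentence_capitals(mystr))
```

Note that [a-z]+ needs at least one lowercase letter after the capital, so single-letter words such as "I" are not matched, and the pattern relies on every sentence ending with exactly a full stop and a space, as stated in the question.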
A very basic option: capture a capitalized word that follows whitespace, as long as the character before that whitespace is not a full stop.
[^.]\s([A-Z]\w+)
import re
s = 'This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence, And others.'
print(re.findall(r'[^.]\s([A-Z]\w+)', s))
Prints:
['Test', 'Supposed', 'Ignore', 'Words', 'Sentence', 'And']
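One caveat worth knowing: because [^.]\s consumes the two characters before the word, two capitalized words in a row can make this pattern skip the second one. The snippet below is a small sketch comparing it with the lookbehind expression from the other answer; the test sentence is made up for illustration and is not from the question.

```python
import re

# Hypothetical test sentence with two capitalized words in a row.
text = "We visited the Big Apple. Then we went home."

# Consuming pattern: the space before "Apple" directly follows the end of the
# previous match, so there is no non-full-stop character left to pair with it.
consuming = re.findall(r'[^.]\s([A-Z]\w+)', text)

# Lookbehind pattern: checks the preceding characters without consuming them.
lookbehind = re.findall(r'(?<!^)(?<!\. )[A-Z][a-z]+', text)

print(consuming)   # ['Big']
print(lookbehind)  # ['Big', 'Apple']
```

If consecutive capitalized words never occur in your input, either pattern should do.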