
Regex to find words starting with capital letters not at beginning of sentence

I've managed to find the words beginning with capital letters, but I can't figure out a regex that filters out the ones at the beginning of a sentence.

Each sentence ends with a full stop and a space.

  • Test_string = This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence.

  • Desired output = ['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']

I'm coding in Python. Will be glad if someone can help me out with the regex :)

You may use the following expression:

(?<!^)(?<!\. )[A-Z][a-z]+

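The lookbehind (?<!^) rejects a match at the very start of the string, and (?<!\. ) rejects one that directly follows a full stop and a space.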


import re

mystr = "This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence."

# Skip capitalized words at the start of the string or right after ". "
print(re.findall(r'(?<!^)(?<!\. )[A-Z][a-z]+', mystr))

Prints:

['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']
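As a rough sketch, the same pattern can be compiled once and wrapped in a small helper. The function name mid_sentence_capitals and the second test sentence are made up for illustration and are not part of the original answer. The second call shows one consequence of [A-Z][a-z]+: a lone capital such as "I" is not matched, because at least one lowercase letter must follow.

```python
import re

# Pattern taken from the answer above; the helper name is hypothetical.
PATTERN = re.compile(r'(?<!^)(?<!\. )[A-Z][a-z]+')

def mid_sentence_capitals(text):
    """Return capitalized words that neither start the string nor follow '. '."""
    return PATTERN.findall(text)

question_text = ("This is a Test sentence. The sentence is Supposed to "
                 "Ignore the Words at the beginning of the Sentence.")
print(mid_sentence_capitals(question_text))
# ['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']

# "I" is skipped: [a-z]+ needs at least one lowercase letter after the capital.
print(mid_sentence_capitals("He said I like Paris. Really."))
# ['Paris']
```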

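A very basic option: require a character other than a full stop, followed by whitespace, before the capitalized word, and capture only the word itself.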

[^.]\s([A-Z]\w+)

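import re

s = 'This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence, And others.'

# The group captures the word; the preceding non-"." character and whitespace are consumed but not returned
print(re.findall(r'[^.]\s([A-Z]\w+)', s))

Output: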

['Test', 'Supposed', 'Ignore', 'Words', 'Sentence', 'And']
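One difference between the two answers is worth illustrating. The basic pattern consumes the character and whitespace before each word, so two capitalized words in a row can cause the second one to be missed, while the lookbehind pattern consumes nothing before the word. The sample sentence and variable names below are made up for this sketch.

```python
import re

# Both patterns are copied from the answers; the variable names are illustrative.
consuming = re.compile(r'[^.]\s([A-Z]\w+)')
lookbehind = re.compile(r'(?<!^)(?<!\. )[A-Z][a-z]+')

s = "We met Alice Smith. Then Bob left."

# After matching "t Alice", the search resumes at the space before "Smith",
# and no character is left to play the role of [^.] in front of that space.
print(consuming.findall(s))
# ['Alice', 'Bob']

# Lookbehinds are zero-width, so adjacent capitalized words are all found.
print(lookbehind.findall(s))
# ['Alice', 'Smith', 'Bob']
```

If adjacent capitalized words matter for your text, the lookbehind version is probably the safer choice of the two.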
