I am trying to find words in a text that start with a capital letter but are not preceded by a space. Example:
string1 = "MynameisStuartLittle" # expected result ["Mynameis","Stuart","Little"]
string2 = "MynameisStuart Little Junior" # expected result ["Mynameis","Stuart"]
string3 = "My name is AlphredHitchcock" # expected result ["My","Hitchcock"]
result = re.findall(r"([^ ]([A-Z][a-z]+))",string1)
print(result)
Another solution I am considering is to split the text on spaces and check each word individually with the regex r"([A-Z][a-z]+)"; if findall returns more than one match, that word is eligible for my result. However, I am looking for a single-regex solution.
You can use a negative lookbehind, (?<!...):
import re
string1 = "MynameisStuartLittle"
string2 = "MynameisStuart Little Junior"
string3 = "My name is AlphredHitchcock"
print(re.findall(r"(?<! )[A-Z][a-z]*", string1)) # ['Mynameis', 'Stuart', 'Little']
print(re.findall(r"(?<! )[A-Z][a-z]*", string2)) # ['Mynameis', 'Stuart']
print(re.findall(r"(?<! )[A-Z][a-z]*", string3)) # ['My', 'Hitchcock']
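As a rough sketch of why the lookbehind works where the attempt in the question does not: `[^ ]` consumes the character before the capital letter, so a word at the very start of the string can never match, and the two capture groups make `findall` return tuples. The lookbehind `(?<! )` only checks the preceding position without consuming anything, and it also succeeds at the start of the string. The helper name `capitalized_words` and the constant name below are hypothetical, chosen only for illustration.

```python
import re

# Pattern from the answer above, compiled once for reuse.
# The constant and function names are illustrative, not from the original post.
NOT_AFTER_SPACE = re.compile(r"(?<! )[A-Z][a-z]*")

def capitalized_words(text):
    """Return capitalized runs that are not directly preceded by a space."""
    return NOT_AFTER_SPACE.findall(text)

# The question's pattern consumes one character before the capital and
# has two groups, so findall yields (outer, inner) tuples.
attempt = re.findall(r"([^ ]([A-Z][a-z]+))", "MynameisStuartLittle")
print(attempt)  # [('sStuart', 'Stuart')]

print(capitalized_words("MynameisStuartLittle"))  # ['Mynameis', 'Stuart', 'Little']
```

If tabs or newlines should also count as separators, `(?<!\s)` might be a reasonable substitute for `(?<! )`, though that goes beyond what the question asked for.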