How can I find the words in a string that start with a capital letter?

How can I find the words in a string that start with a capital letter?

Example input:

input_str = "The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace."

Expected output:

Persian League Iran Persian League

Assuming you can accept The and This as well:

import re
input_string = "The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace."
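# \b anchors at a word boundary; [A-Z]\w* matches a word starting with an ASCII capital.
# A raw string avoids the invalid-escape warning, and words at the very end of the text still match.
matches = re.findall(r"\b[A-Z]\w*", input_string)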

gives

['The', 'Persian', 'League', 'Iran', 'The', 'Persian', 'League', 'This']

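If you need to ignore The and This:

# The trailing \b inside the lookahead excludes only the exact words The and This,
# so words such as "Theory" are still matched.
matches = re.findall(r"\b(?!(?:The|This)\b)[A-Z]\w*", input_string)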

gives

['Persian', 'League', 'Iran', 'Persian', 'League']
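
If the real goal is to drop words that are capitalized only because they begin a sentence, rather than the two specific words The and This, one possible approach is to skip the first word of each sentence. The sketch below assumes sentences end with ".", "!" or "?" followed by whitespace, and the function name is made up for illustration. Note that it would also drop a proper noun that happens to start a sentence.

```python
import re

def capitalized_not_sentence_initial(text):
    """Return capitalized words that are not the first word of a sentence."""
    found = []
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"\b\w+", sentence)
        # Skip the first word, which is capitalized by convention.
        found.extend(w for w in words[1:] if w[0].isupper())
    return found

input_string = ("The Persian League is the largest sport event dedicated to "
                "the deprived areas of Iran. The Persian League promotes peace "
                "and friendship. This video was captured by one of our heroes "
                "who wishes peace.")
print(" ".join(capitalized_not_sentence_initial(input_string)))
# Persian League Iran Persian League
```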

Without regex:

txt = "The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship."

print([w for w in txt.split() if w.istitle()])

Output:

['The', 'Persian', 'League', 'Iran.', 'The', 'Persian', 'League']
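
Note that str.istitle() is stricter than checking only the first character: an uppercase letter may only follow an uncased character, so acronyms and mixed-case names are rejected. A small comparison, using made-up sample words:

```python
samples = ["Iran.", "NASA", "McDonald", "the"]

for w in samples:
    # Print the word, the istitle() result, and the first-character check.
    print(w, w.istitle(), w[0].isupper())

title_only = [w for w in samples if w.istitle()]
first_upper = [w for w in samples if w[0].isupper()]
print(title_only)   # ['Iran.']
print(first_upper)  # ['Iran.', 'NASA', 'McDonald']
```

Which check is right depends on whether words like NASA should count as starting with a capital letter.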

If you want to skip "The" (or any other word, for that matter), try this:

print(" ".join(w.replace(".", "") for w in txt.split() if w[0].isupper() and w not in ["The", "This"]))

Output:

Persian League Iran Persian League

s = """
The Persian League is the largest sport event dedicated to the deprived areas
of Iran. The Persian League promotes peace and friendship. This video was
captured by one of our heroes who wishes peace.
"""
# split() with no arguments splits on any whitespace, including newlines
print([x for x in s.split() if x[0].isupper()])
# ['The', 'Persian', 'League', 'Iran.', 'The', 'Persian', 'League', 'This']

Try this:

import re
inputString = "The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship."
# Raw string so that \. is passed to the regex engine as a literal dot
splitted = re.split(r' |\.', inputString)
result = filter(lambda x: len(x) > 0 and x[0].isupper(), splitted)
print(list(result))

Result:

['The', 'Persian', 'League', 'Iran', 'The', 'Persian', 'League']

Another way is to loop over the words with a for loop and collect the ones that start with a capital letter in a list.

phrase = 'The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace.'

# split() with no argument never yields empty strings, so word[0] below is always safe
wordsplit = phrase.split()
capitalLettersWords = []
for word in wordsplit:
    if word[0].isupper():
        capitalLettersWords.append(word)

print(capitalLettersWords)
# ['The', 'Persian', 'League', 'Iran.', 'The', 'Persian', 'League', 'This']

This example uses str.isupper() and str.split(), both built-in string methods in Python.
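
Pulling the ideas from the answers above together, a small helper might look like the sketch below. The function name capitalized_words and its skip parameter are hypothetical, and it assumes that stripping leading and trailing ASCII punctuation is enough cleanup for your text.

```python
import string

def capitalized_words(text, skip=()):
    """Return words starting with an uppercase letter, minus those in `skip`."""
    skip = set(skip)
    result = []
    for raw in text.split():  # split() ignores repeated whitespace
        word = raw.strip(string.punctuation)  # e.g. 'Iran.' -> 'Iran'
        if word and word[0].isupper() and word not in skip:
            result.append(word)
    return result

phrase = ("The Persian League is the largest sport event dedicated to the "
          "deprived areas of Iran. The Persian League promotes peace and "
          "friendship. This video was captured by one of our heroes who "
          "wishes peace.")
print(" ".join(capitalized_words(phrase, skip={"The", "This"})))
# Persian League Iran Persian League
```

Because the check uses str.isupper() rather than [A-Z], it also accepts non-ASCII capitals such as "É".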
