
Find index word in string

Write a Python program that prints the index words of a text (words starting with an uppercase letter), each together with its word number in the text. If the text contains no such word, print None. The first word of a sentence should not be considered an index word. (Word numbering starts from one.)

Numbers are not counted except index words. The only symbol used in a sentence besides the period is the comma. Be sure to remove the semicolon if it is at the end of a word.

Input: The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace.

Output: 2:Persian 3:League 15:Iran 17:Persian 18:League

How can I fix it?

import re

inputText = ""

# we will use this regex pattern to check if a word starts with an uppercase letter
isUpperCase = re.compile(r"^([A-Z])\w+")

# we will store uppercase words in this list
result = []
# number of the word in the whole input
wordIndex = 0

# separate sentences
sentences = inputText.strip().split('.')

for s in sentences:
    # get the list of words in each sentence
    words = s.strip().split(' ')

    for index, word in enumerate(words):
        # increase wordIndex
        wordIndex += 1

        # just pass the first word
        if index == 0:
            continue

        # check regex and if true add word and wordIndex to result
        if isUpperCase.match(word):
            result.append({
                "index": wordIndex,
                "word": word
            })


# finally print result
for word in result:
    print(word["index"], ": ", word["word"])
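
Two details of the snippet above may be worth checking against the expected output. The pattern requires at least one more character after the capital letter, and print() separates its arguments with spaces. The short check below is only an illustration; the name `is_upper_case` and the sample words are chosen here and are not part of the original code.

```python
import re

# Same pattern as in the snippet above (written as a raw string).
is_upper_case = re.compile(r"^([A-Z])\w+")

# \w+ needs at least one more character after the capital letter,
# so a one-letter word such as "I" is not matched.
print(bool(is_upper_case.match("Iran")))  # True
print(bool(is_upper_case.match("I")))     # False

# print() inserts a space between its arguments ...
print(2, ": ", "Persian")   # 2 :  Persian
# ... while an f-string gives the compact index:word form.
print(f"{2}:{'Persian'}")   # 2:Persian
```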

You can add each uppercase word to a dictionary together with its index value (plus one). I noticed that you didn't return The or This, but I don't know what the rule is there, so I just added an exemption for the first word of each sentence.

import collections
uppercase_letters = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I',
                     'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',
                     'T', 'U', 'V', 'W', 'X', 'Y', 'Z']

mystring = 'The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace.'
counter_dict = collections.defaultdict(list)

counter = 0
overall_counter = 0
for i in mystring.split('.'):
    counter = 0
    for j in i.split():
        counter += 1
        overall_counter += 1
        if counter == 1:
            continue
        if j[0] in uppercase_letters:
            counter_dict[i].append(overall_counter)

counter_list = []
    
for i in counter_dict.values():
    for j in i:
        counter_list.append(j)

for i in sorted(counter_list):
    print(str(i) + ':', mystring.split()[i-1].rstrip('.'), '', sep=' ', end='', flush=True)

>>> 2: Persian 3: League 15: Iran 17: Persian 18: League 
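
Neither snippet above handles the comma rule or the None case from the question, so here is a minimal sketch that tries to cover both. The function names `find_index_words` and `format_index_words` are made up for illustration. It assumes sentences end only with a period, that a trailing comma is the only punctuation to strip, and that `str.isupper()` is an acceptable test for "starts with an uppercase letter". It does not try to interpret the rule about numbers, so numeric tokens are counted like any other word.

```python
def find_index_words(text):
    """Return (position, word) pairs for capitalized words that do not start a sentence."""
    found = []
    position = 0  # running word number across the whole text, starting from one
    for sentence in text.split('.'):
        # split() with no argument drops empty strings, so the empty piece
        # after the final period does not advance the counter.
        for offset, token in enumerate(sentence.split()):
            position += 1
            word = token.rstrip(',')  # assumption: only a trailing comma needs removing
            # offset 0 is the first word of the sentence, which is never an index word
            if offset > 0 and word[:1].isupper():
                found.append((position, word))
    return found


def format_index_words(text):
    """Format the pairs as 'n:word' separated by spaces, or 'None' if there are none."""
    pairs = find_index_words(text)
    if not pairs:
        return "None"
    return " ".join(f"{n}:{w}" for n, w in pairs)


sample = ('The Persian League is the largest sport event dedicated to the '
          'deprived areas of Iran. The Persian League promotes peace and '
          'friendship. This video was captured by one of our heroes who '
          'wishes peace.')
print(format_index_words(sample))
# 2:Persian 3:League 15:Iran 17:Persian 18:League
```

Using `word[:1]` instead of `word[0]` keeps the check safe if a token consists only of a comma and becomes empty after stripping.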


 