
Find index words in a string

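Write a Python program that prints the index words in a text (words that begin with a capital letter) together with each word's number (its position in the text). If no word with this property is found, print None. A word at the beginning of a sentence should not be treated as an index word. (Word numbering starts at one.)

Apart from index words, numbers are not counted. Besides the period, the only symbol used in sentences is the comma. If a semicolon is at the end of a word, be sure to remove it.

Input: The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace.

Output: 2:Persian 3:League 15:Iran 17:Persian 18:League

How can I solve this?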

import re

inputText = "The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace."

# regex pattern that checks whether a word starts with an uppercase letter
isUpperCase = re.compile(r"^[A-Z]\w*")

# uppercase words are stored in this list
result = []
# position of the current word in the whole input (1-based)
wordIndex = 0

# separate sentences
sentences = inputText.strip().split('.')

for s in sentences:
    # get the list of words in each sentence
    words = s.split()

    for index, word in enumerate(words):
        # increase wordIndex
        wordIndex += 1

        # skip the first word of each sentence
        if index == 0:
            continue

        # if the word matches, add it and its wordIndex to result
        if isUpperCase.match(word):
            result.append({
                "index": wordIndex,
                "word": word.rstrip(',')
            })


# finally print the result, or None if no index word was found
if result:
    print(" ".join(f"{w['index']}:{w['word']}" for w in result))
else:
    print(None)
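
One detail worth checking in a pattern of this kind is the quantifier after the leading capital. With `\w+` at least one more character is required, so a one-letter word such as `I` would not match, whereas `\w*` accepts it. The short comparison below is only an illustration; the names `strict` and `relaxed` are made up for this sketch.

```python
import re

# capital letter followed by at least one more word character
strict = re.compile(r"^[A-Z]\w+")
# capital letter followed by zero or more word characters
relaxed = re.compile(r"^[A-Z]\w*")

for word in ["Iran", "I", "iran", "League,"]:
    # match() only anchors at the start, so a trailing comma does not prevent a match
    print(word, bool(strict.match(word)), bool(relaxed.match(word)))
```

Of the four sample words, only `I` is treated differently by the two patterns.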

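You can add each capitalized word and its index value (plus one) to a dictionary. I noticed that you are not returning "The" and "This", but I don't know what the rule is there, so I simply added an exemption for those words (the code below skips the first word of every sentence).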

import collections
uppercase_letters = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I',
                     'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',
                     'T', 'U', 'V', 'W', 'X', 'Y', 'Z']

mystring = 'The Persian League is the largest sport event dedicated to the deprived areas of Iran. The Persian League promotes peace and friendship. This video was captured by one of our heroes who wishes peace.'
counter_dict = collections.defaultdict(list)

counter = 0
overall_counter = 0
for i in mystring.split('.'):
    counter = 0
    for j in i.split():
        counter += 1
        overall_counter += 1
        if counter == 1:
            continue
        if j[0] in uppercase_letters:
            counter_dict[i].append(overall_counter)

counter_list = []
    
for i in counter_dict.values():
    for j in i:
        counter_list.append(j)

for i in sorted(counter_list):
    print(str(i) + ':', mystring.split()[i-1].rstrip('.,'), '', sep=' ', end='', flush=True)

>>> 2: Persian 3: League 15: Iran 17: Persian 18: League 
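
The same idea can be written more compactly as a function that also covers the "print None" case from the question. This is a minimal sketch, not the answerer's code: the name `find_index_words` is hypothetical, every whitespace-separated token is assumed to count toward the word number, and trailing commas or semicolons are assumed to be stripped from each word.

```python
def find_index_words(text):
    """Return (position, word) pairs for capitalized words that do not start a sentence."""
    found = []
    position = 0  # 1-based position across the whole text
    for sentence in text.split('.'):
        for i, token in enumerate(sentence.split()):
            position += 1
            # assumption: trailing commas/semicolons are not part of the word
            word = token.strip(',;')
            # i == 0 is the first word of the sentence, which is never an index word
            if i > 0 and word[:1].isupper():
                found.append((position, word))
    return found


sample = ("The Persian League is the largest sport event dedicated to the "
          "deprived areas of Iran. The Persian League promotes peace and "
          "friendship. This video was captured by one of our heroes who "
          "wishes peace.")

pairs = find_index_words(sample)
# print the pairs in the "number:word" format, or None when nothing was found
print(" ".join(f"{n}:{w}" for n, w in pairs) if pairs else None)
```

Returning a list rather than printing inside the loop keeps the "nothing found" check in one place and makes the function easy to test on other inputs.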

