Python: how can I print which sentence a word is in?

I'm making a concordance program. I want it to tell me which sentence each word is in, so if I have:

"Hello world. My name is Nathan and I need help on Python. I am very confused and any help is appreciated."

I want it to print which sentence each word comes from. I already have it counting the total number of times each word appears; next to that count I need the sentence number(s) the word comes from, so that it displays as:

a. word {word appearance count:sentence number}

with 'a.' serving as the list label (like a numbered list, but with letters). An example from the text above would be:

a. help {2:2,3}

Here's the code I currently have:

word_counter = {}
sent_num = {}
linenum = 0
wordnum = 0
counter = 0

#not working
for word in f.lower().split('.'):
    if not word in sent_num:
        sent_num[word] = []
    sent_num[word].append(f.find(wordnum))


#working correctly
for word in f.lower().split():
    if not word in word_counter:
        word_counter[word] = []
        #if the word isn't listed yet, adds it
    word_counter[word].append(linenum)

for key in sorted(word_counter):
    counter += 1
    print (counter, key, len(word_counter[key]), len(sent_num[key]))

In your code, when you look at full sentences, you only split on '.', so each item you loop over is a whole sentence rather than a word. You need to split each sentence into words after that:

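# enumerate numbers the sentences from 1; the original f.find(wordnum)
# would raise TypeError because wordnum is an int, not a string
for sent_index, sentence in enumerate(f.split('.'), start=1):
    for word in sentence.lower().split():
        if word not in sent_num:
            sent_num[word] = []
        # record the number of the sentence this occurrence is in
        sent_num[word].append(sent_index)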

or something along those lines, depending on what you want to look at and count.

It's pretty simple to iterate over each sentence, then over each word in that sentence, and build a dictionary that maps {word: [sentence, ...]} :

In [1]:
d = {}
for i, sent in enumerate(f.lower().split('. ')):
    for w in sent.strip().split():
        d.setdefault(w, []).append(i)
d

Out[1]:
{'am': [2],
 'and': [1, 2],
 'any': [2],
 'appreciated.': [2],
 'confused': [2],
 'hello': [0],
 'help': [1, 2],
 ...}
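
Note that the output above uses zero-based sentence numbers and keeps the trailing period on 'appreciated.'. The following is a minimal sketch of one way to get 1-based numbers and punctuation-free keys; the function name `sentence_index` and the regular expression used to split sentences are assumptions for illustration, not part of the original answer.

```python
import re
import string

text = ("Hello world. My name is Nathan and I need help on Python. "
        "I am very confused and any help is appreciated.")

def sentence_index(text):
    """Map each lowercased word to the 1-based numbers of the sentences it occurs in."""
    index = {}
    # Split after '.', '!' or '?' when followed by whitespace (a simplification).
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    for number, sentence in enumerate(sentences, start=1):
        for raw in sentence.lower().split():
            # Drop leading/trailing punctuation such as the final period.
            word = raw.strip(string.punctuation)
            if word:
                index.setdefault(word, []).append(number)
    return index

index = sentence_index(text)
print(index['help'])         # [2, 3]
print(index['appreciated'])  # [3]
```

This simple split will mis-handle abbreviations such as "e.g." or "Dr.", so treat it as a starting point only.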

Since the list holds one entry for every occurrence of the word, you can get the count just by calling len() , e.g.:

In [2]:
len(d['help'])

Out[2]:
2
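
To print entries in the format the question asks for (a. word {count:sentence numbers}), something like the sketch below might work. The dictionary `index` here is hypothetical sample data in the same word-to-list shape as `d`, with 1-based sentence numbers assumed, and `format_concordance` is an illustrative name.

```python
from string import ascii_lowercase

# Hypothetical sample data: word -> one sentence number per occurrence
# (1-based numbering is an assumption).
index = {'and': [2, 3], 'hello': [1], 'help': [2, 3]}

def format_concordance(index):
    """Return lines such as 'c. help {2:2,3}' for the words in sorted order."""
    lines = []
    # zip stops after 26 words because there are only 26 letter labels.
    for label, word in zip(ascii_lowercase, sorted(index)):
        sentences = index[word]
        numbers = ','.join(str(n) for n in sentences)
        # Doubled braces in an f-string produce literal '{' and '}'.
        lines.append(f"{label}. {word} {{{len(sentences)}:{numbers}}}")
    return lines

for line in format_concordance(index):
    print(line)
```

A word that occurs twice in one sentence would show that sentence number twice; whether to deduplicate depends on the output you want.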
