
Too many lists of unique words

This is a homework project from last week. I had problems with it, so I did not turn it in, but I like to go back and see if I can make these projects work. I now have it printing the right words in alphabetical order. The problem is that it prints three separate lists of unique words, each with a different number of words. How can I fix this?

import string
def process_line(line_str,word_set):
    line_str=line_str.strip()
    list_of_words=line_str.split()
    for word in list_of_words:
        if word!="--":
            word=word.strip()
            word=word.strip(string.punctuation)
            word=word.lower()
            word_set.add(word)
def pretty_print(word_set):
    list_of_words=[]
    for w in word_set:
        list_of_words.append(w)
        list_of_words.sort()
    for w in list_of_words:
        print(w,end=" ")
word_set=set([])
fObject=open("gettysburg.txt")
for line_str in fObject:
    process_line(line_str,word_set)
    print("\nlength of the word set: ",len(word_set))
    print("\nUnique words in set: ")
    pretty_print(word_set)

Below is the output I get. I only want it to give me the last list, the one with 138 words. I appreciate any help.

length of the word set:  29

Unique words in set:

a ago all and are brought conceived continent created dedicated equal fathers forth four in liberty men nation new on our proposition score seven that the this to years 

length of the word set:  71

Unique words in set:

a ago all altogether and any are as battlefield brought can civil come conceived continent created dedicate dedicated do endure engaged equal fathers field final fitting for forth four gave great have here in is it liberty live lives long men met might nation new now of on or our place portion proper proposition resting score seven should so testing that the their this those to war we whether who years 

length of the word set:  138

Unique words in set:

a above add advanced ago all altogether and any are as battlefield be before birth brave brought but by can cause civil come conceived consecrate consecrated continent created dead dedicate dedicated detract devotion did died do earth endure engaged equal far fathers field final fitting for forget forth fought four freedom from full gave god government great ground hallow have here highly honored in increased is it larger last liberty little live lives living long measure men met might nation never new nobly nor not note now of on or our people perish place poor portion power proper proposition rather remaining remember resolve resting say score sense seven shall should so struggled take task testing that the their these they this those thus to under unfinished us vain war we what whether which who will work world years 

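Take the last three lines out of the for loop. As written, they are indented inside the loop, so the growing set is printed once for each line read from the file (three times here). Unindented, they run only once, after the whole file has been processed: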

....
for line_str in fObject:
    process_line(line_str,word_set)

print("\nlength of the word set: ",len(word_set))
print("\nUnique words in set: ")
pretty_print(word_set)
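
For reference, here is a minimal sketch of the whole program with the printing moved after the loop. It is one possible arrangement, not the only one. The `unique_words` helper, the `sample_lines` list (an in-memory stand-in for `gettysburg.txt`) and the `if word:` check are assumptions added for illustration; they are not part of the original code.

```python
import string


def process_line(line_str, word_set):
    """Add the cleaned, lowercased words of one line to word_set."""
    for word in line_str.split():
        # Skip the stand-alone "--" token, as the original code does.
        if word != "--":
            word = word.strip(string.punctuation).lower()
            if word:  # assumption: ignore tokens that were only punctuation
                word_set.add(word)


def unique_words(lines):
    """Hypothetical helper: return the sorted unique words in an iterable of lines."""
    word_set = set()
    for line_str in lines:
        process_line(line_str, word_set)
    # Sorting happens once, after every line has been processed.
    return sorted(word_set)


# In-memory stand-in for the file; an open file object can be passed instead.
sample_lines = [
    "Four score and seven years ago our fathers brought forth,\n",
    "on this continent, a new nation -- conceived in liberty.\n",
]

words = unique_words(sample_lines)

# These statements are outside every loop, so they run exactly once.
print("length of the word set:", len(words))
print("Unique words in set:")
print(" ".join(words))
```

To run it against the real file, something like `with open("gettysburg.txt") as fObject: words = unique_words(fObject)` should work, since a file object yields its lines one at a time.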
