
Importing random words from a file without duplicates (Python)

I'm attempting to create a program that selects 10 words from a text file containing 10 or more words. When importing these 10 words from the text file, I must not import the same word twice. Currently I'm using a list for this, but the same words seem to appear more than once. I have some knowledge of sets and know they cannot hold the same value twice. As of now I'm clueless about how to solve this, so any help would be much appreciated. Thanks!

Please find the relevant code below. (P.S. FileSelection is basically an open-file dialog.)

def GameStage03_E():
    global WordList
    if WrdCount >= 10:
        WordList = []
        for n in range(0,10):
            FileLines = open(FileSelection).read().splitlines()
            RandWrd = random.choice(FileLines)
            WordList.append(RandWrd)
        SelectButton.destroy()
        GameStage01Button.destroy()
        GameStage04_E()
    elif WrdCount <= 10:
        tkinter.messagebox.showinfo("ERROR", " Insufficient Amount Of Words Within Your Text File! ")

Make WordList a set:

WordList = set()

Then update that set instead of appending:

WordList.update(set([RandWrd]))

Of course, WordList would be a bad name for a set.

There are a few other problems, though:
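As a small illustration of why a set helps here, the sketch below (the word values are made up for the example) shows that adding a word that is already present leaves the set unchanged, and that `add` is a shorter spelling of `update(set([...]))` when there is only one item:

```python
words = set()

# update() takes an iterable of items; add() takes a single item.
words.update(set(["apple"]))
words.add("banana")

# Adding a word that is already in the set has no effect.
words.add("apple")

print(len(words))     # 2
print(sorted(words))  # ['apple', 'banana']
```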

  • Don't capitalize the names of variables and functions; PEP 8 recommends lowercase_with_underscores for both.
  • What happens if you draw the same word twice in your loop? If words can be drawn more than once, there is no guarantee that WordList will contain 10 items after the loop completes.
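To make the second point concrete, here is a minimal sketch that assumes a hypothetical in-memory list `file_lines` in place of the real file. Ten draws with `random.choice` collected into a set can end up with fewer than ten items, because a repeated draw does not grow the set:

```python
import random

# Hypothetical stand-in for the lines of the word file (12 distinct words).
file_lines = ["word{}".format(i) for i in range(12)]

drawn = set()
for _ in range(10):
    # random.choice may return a word that was already drawn,
    # in which case the set stays the same size.
    drawn.add(random.choice(file_lines))

# Anywhere from 1 to 10 items; often fewer than 10.
print(len(drawn))
```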

The second problem might be addressed by changing your loop to:

    while len(WordList) < 10:
        FileLines = open(FileSelection).read().splitlines()
        RandWrd = random.choice(FileLines)
        WordList.update(set([RandWrd]))

You would still have to account for the case where the file does not contain 10 distinct words after all, though; in that case this loop would never terminate.

Even then, the loop would still be quite inefficient, as you might draw the same word over and over again with random.choice(FileLines). But maybe you can base something useful off of that.
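One possible way to sidestep the repeated draws, offered as a sketch rather than as part of the original answer, is to read the file once, drop duplicate lines, and let `random.sample` pick distinct items in a single call. The function name `pick_unique_words` and its parameters are hypothetical, and the example uses an in-memory list instead of a real file:

```python
import random


def pick_unique_words(lines, count=10):
    """Return `count` distinct words chosen at random from `lines`.

    `lines` is any iterable of strings, e.g. the result of
    open(path).read().splitlines(). Raises ValueError when there
    are fewer than `count` distinct non-empty words.
    """
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(w.strip() for w in lines if w.strip()))
    if len(unique) < count:
        raise ValueError(
            "need {} distinct words, found {}".format(count, len(unique))
        )
    # random.sample draws without replacement, so no word repeats.
    return random.sample(unique, count)


# Example with an in-memory list standing in for the file contents.
sample_lines = ["w{}".format(i % 15) for i in range(30)]
chosen = pick_unique_words(sample_lines)
print(len(chosen), len(set(chosen)))  # 10 10
```

In the question's function, the loop could then presumably be replaced by a single call on the lines read from FileSelection, with the ValueError taking over the role of the "insufficient words" message.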

Not sure I understand you correctly, but on line 3, "if wrdcount": where did you give wrdcount a value? Maybe you intended something along the lines of the following:

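wordset = set()  # {} would create an empty dict, not a set
wrdcount = len(wordset)
with open(FileSelection) as word_file:  # assumes one word per line
    while wrdcount < 10:
        line = word_file.readline()
        if not line:  # end of file
            break
        wordset.update(line.split())  # adds nothing for blank lines
        wrdcount = len(wordset)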