
Issues removing words from a list in Python

I'm building a Wordle solver. It basically removes words from a list if they don't contain specific characters, or don't have them at specific positions. I'm not concerned about the statistics for optimal choices yet.

When I run the code below (I think all relevant sections are included), the output clearly shows that it found a letter in the matching position of the 'word of the day'. But on the next iteration it chooses a word that doesn't have that letter, when it should only select from the remaining words.

Are words not actually being removed? Or is something shadowing a scope that I can't find? I've rewritten whole sections, and the exact same problem happens.

#Some imports and reading the word list here. 

def word_compare(word_of_the_day, choice_word):
    results = []
    index = 0
    letters[:] = choice_word
    for letter in letters:
        if letter is word_of_the_day[index]:
            results.append((letter, 2, index))
        elif letter in word_of_the_day:
            results.append((letter, 1, index))
        else:
            results.append((letter, 0, index))
        index += 1
    print("\nIteration %s\nWord of the Day: %s,\nChoice Word: %s,\nResults: %s" % (
        iteration, word_of_the_day, choice_word, results))
    return results


def remove_wrong_words():
    for item in results:
        if item[1] == 0:
            for word in words:
                if item[0] in word:
                    words.remove(word)
    for item in results:
        if item[1] == 2:
            for word in words:
                if word[item[2]] != item[0]:
                    words.remove(word)
    print("Words Remaining: %s" % len(words))
    return words


words, letters = prep([])
# choice_word = best_word_choice()
choice_word = "crane"
iteration = 1
word_of_the_day = random.choice(words)

while True:
    if choice_word == word_of_the_day:
        break
    else:
        words.remove(choice_word)
        results = word_compare(word_of_the_day, choice_word)
        words = remove_wrong_words()
        if len(words) < 10:
            print(words)
        choice_word = random.choice(words)
        iteration += 1

Output I'm getting:

Iteration 1
Word of the Day: stake,
Choice Word: crane,
Results: [('c', 0, 0), ('r', 0, 1), ('a', 2, 2), ('n', 0, 3), ('e', 2, 4)]
Words Remaining: 386

Iteration 2
Word of the Day: stake,
Choice Word: lease,
Results: [('l', 0, 0), ('e', 1, 1), ('a', 2, 2), ('s', 1, 3), ('e', 2, 4)]
Words Remaining: 112

Iteration 3
Word of the Day: stake,
Choice Word: paste,
Results: [('p', 0, 0), ('a', 1, 1), ('s', 1, 2), ('t', 1, 3), ('e', 2, 4)]
Words Remaining: 81

Iteration 4
Word of the Day: stake,
Choice Word: spite,

... This continues for a while until solved. In this output, 'a' is found to be in the correct place (value of 2 in the tuple) on the second iteration. This should remove all words from the list that don't have 'a' as the third character. Instead, 'paste' and 'spite' are chosen for later iterations from that same list when they should have been removed.

Your issue has to do with removing items from a list while you iterate over it. This often results in skipping later values, because list iteration is handled by index under the covers.

Specifically, the problem is here (and probably in the other loop too):

for word in words:
    if item[0] in word:
        words.remove(word)

If the if condition is true for the first word in the words list, the second word will not be checked. That's because when the for loop asks the list iterator for the next value, it yields the second value of the list as it now stands, which is the third value from the original list (since the first one is gone).
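To make the skipping concrete, here is a small sketch with a made-up four-word list (the sample words are only for illustration, not your real word list). It removes every word containing 'c' while iterating over the same list:

```python
# Hypothetical sample data, chosen so that two matching words are adjacent.
words = ["crane", "crate", "stake", "cramp"]

for word in words:
    if "c" in word:
        # Removing shifts every later item one slot to the left,
        # so the item that slides into the current slot is never visited.
        words.remove(word)

print(words)  # "crate" survives even though it contains "c"
```

After "crane" is removed, "crate" moves into index 0, but the iterator has already moved on to index 1, so "crate" is never tested.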

There are a few ways you could avoid this problem.

One approach is to iterate over a copy of the list you're going to modify. The iterator then never skips anything, since nothing is removed from the copied list as you go (only the original list changes). A common way to make the copy is with a slice:

for word in words[:]:       # iterate on a copy of the list
    if item[0] in word:
        words.remove(word)  # modify the original list here

Another option is to build a new list containing only the valid values from the original list, rather than removing the invalid ones. A list comprehension is often good enough for this:

words = [word for word in words if item[0] not in word]

This may be slightly complicated in your example because you're using global variables. You would either need to change that design (e.g. accept a list as an argument and return the new version), or add a global words statement so the function's code can rebind the global variable (rather than modifying it in place).
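As a rough sketch of the first design (passing the list in and returning a new one), remove_wrong_words could look something like this. The two-parameter signature is my assumption, not your original code, and it applies the same two rules your version does (nothing extra for letters scored 1):

```python
def remove_wrong_words(words, results):
    """Return a new list without the words that `results` rules out.

    Assumed signature: `words` is a list of strings and `results` is the
    list of (letter, score, index) tuples produced by word_compare.
    """
    for letter, score, index in results:
        if score == 0:
            # Letter is absent from the answer: drop every word containing it.
            words = [w for w in words if letter not in w]
        elif score == 2:
            # Letter is in the right place: keep only words that match there.
            words = [w for w in words if w[index] == letter]
    return words


# Usage with the first result from the question's output.
results = [('c', 0, 0), ('r', 0, 1), ('a', 2, 2), ('n', 0, 3), ('e', 2, 4)]
candidates = ["stake", "paste", "spite", "shake", "crane"]
remaining = remove_wrong_words(candidates, results)
print(remaining)
```

Because the function only rebinds its local name, the caller's list is left untouched and the main loop can keep doing words = remove_wrong_words(words, results).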

I think one of your issues is the following line: if letter is word_of_the_day[index]: . This should be ==, not is, since the latter checks whether the two objects being compared are the same object (i.e. have the same id() ), not whether they have the same value. Your output does show tuples with a value of 2, so the check happens to succeed here, most likely because the interpreter reuses the same object for equal single-character strings. That is an implementation detail and shouldn't be relied on; if the identity check ever failed, no tuple would get a 2 and the second for loop in remove_wrong_words wouldn't do anything either. There may be more going on, but I'd like a concrete example to run before digging in further.
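For reference, a minimal sketch of word_compare that uses == and enumerate. It assumes the function only needs its two arguments, so the global letters list, the iteration counter, and the print call from the original are left out:

```python
def word_compare(word_of_the_day, choice_word):
    """Score each letter of choice_word: 2 = right place, 1 = elsewhere, 0 = absent."""
    results = []
    for index, letter in enumerate(choice_word):
        if letter == word_of_the_day[index]:   # value comparison, not identity
            results.append((letter, 2, index))
        elif letter in word_of_the_day:
            results.append((letter, 1, index))
        else:
            results.append((letter, 0, index))
    return results


print(word_compare("stake", "crane"))
```

The scoring rules are unchanged from the question, so this yields the same tuples as the first iteration in the posted output.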
