CS50 Problem Set 6, IndexError: list index out of range

Question

I don't know what is going wrong here, but I keep getting an error message whenever I try to use the large database. For example:

dna/ $ python dna.py databases/large.csv sequences/10.txt
Traceback (most recent call last):
  File "/workspaces/103840690/dna/dna.py", line 104, in <module>
    main()
  File "/workspaces/103840690/dna/dna.py", line 47, in main
    check[i][j] = False
IndexError: list index out of range

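I know this type of error means I am trying to access an index that does not exist, but nothing I have tried seems to work. What is also strange is that I only get it when using the large database.

The problem is probably on lines 40 - 49, where the comment "Check database for matching profiles" is; I have pasted the whole code for context.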

import csv
import sys


def main():

    # Check for command-line usage
    if len(sys.argv) != 3:
        print("Two command-line arguments needed. ")
        return 1


    # Read database file into a variable
    with open(sys.argv[1], "r") as csv_file:
        csv_database = csv.DictReader(csv_file)

        # create a list where we can put dictionaries
        database = []
        for lines in csv_database:
            database.append(lines)

        # create a keys list where we can put STRs
        STRs = []
        for key in database[0].keys():
            STRs.append(key)
        STRs.remove("name")


    # Read DNA sequence file into a variable
    with open(sys.argv[2], "r") as txt_file:
        sequence = txt_file.read()


    # Find longest match of each STR in DNA sequence
    matches = {}
    for i in range(len(STRs)):
        matches[STRs[i]] = longest_match(sequence, STRs[i])

    # Check database for matching profiles
    check = [[0]*len(database)]*len(STRs)
    match = None
    for i in range(len(database)):
        for j in range(len(STRs)):
            if matches[STRs[j]] == int(database[i][STRs[j]]):
                check[i][j] = True
            else:
                check[i][j] = False
        if False not in check[i]:
            match = i

    if match != None:
        print(database[match]["name"])
    else:
        print("No match")

    return


def longest_match(sequence, subsequence):
    """Returns length of longest run of subsequence in sequence."""

    # Initialize variables
    longest_run = 0
    subsequence_length = len(subsequence)
    sequence_length = len(sequence)

    # Check each character in sequence for most consecutive runs of subsequence
    for i in range(sequence_length):

        # Initialize count of consecutive runs
        count = 0

        # Check for a subsequence match in a "substring" (a subset of characters) within sequence
        # If a match, move substring to next potential match in sequence
        # Continue moving substring and checking for matches until out of consecutive matches
        while True:

            # Adjust substring start and end
            start = i + count * subsequence_length
            end = start + subsequence_length

            # If there is a match in the substring
            if sequence[start:end] == subsequence:
                count += 1

            # If there is no match in the substring
            else:
                break

        # Update most consecutive matches found
        longest_run = max(longest_run, count)

    # After checking for runs at each character in sequence, return longest run found
    return longest_run


main()

Answer 1

Your indices are in the wrong order. check is a list of len(STRs) elements, and each of those elements is a list of len(database) elements.

    # Check database for matching profiles
    check = [[0]*len(database)]*len(STRs)
    match = None
    for i in range(len(database)):
        for j in range(len(STRs)):
            if matches[STRs[j]] == int(database[i][STRs[j]]):
                check[i][j] = True
            else:
                check[i][j] = False
        if False not in check[i]:
            match = i

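You iterate over the database with the variable i and over the STRs with the variable j. To match the way check is initialized, the result should be stored in check[j][i]. The later lookup check[i] has to follow the same layout; alternatively, initialize check with len(database) rows of len(STRs) entries each and leave the loop as it is.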
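As for why the error only shows up with the large database, here is a rough sketch that reproduces it from the list sizes alone. The helper first_bad_index and the sizes passed to it are made up for illustration (they are not read from the CS50 files). The point is that check[i][j] happens to stay in bounds when the number of people equals the number of STRs, and goes out of range as soon as the two differ.

```python
def first_bad_index(n_people, n_strs):
    """Return the first (i, j) where check[i][j] raises IndexError, or None."""
    # Same construction as in the question: n_strs outer entries,
    # each of length n_people
    check = [[0] * n_people] * n_strs
    for i in range(n_people):
        for j in range(n_strs):
            try:
                check[i][j] = True
            except IndexError:
                return (i, j)
    return None


print(first_bad_index(3, 3))   # same number of people and STRs: None
print(first_bad_index(23, 8))  # more people than STRs: (8, 0)
```

With more people than STRs, i runs past the end of the outer list on the first row that has no counterpart, which matches the traceback pointing at the check[i][j] assignment.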

Answer 2

When you multiply a list, the list as a whole is repeated, not its individual elements. See this example.

a = [[0]*2]*5
print(a)
> [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0]]
print(a[4][1])
> 0
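One further consequence of multiplying the outer list, which the example above does not show: the five rows are one and the same inner list repeated, so an assignment through one row appears in all of them. A minimal sketch, with a list comprehension as one common way to get independent rows (the name b is only for illustration):

```python
a = [[0] * 2] * 5                 # five references to one inner list
a[0][0] = 1
print(a)                          # [[1, 0], [1, 0], [1, 0], [1, 0], [1, 0]]

b = [[0] * 2 for _ in range(5)]   # a new inner list for every row
b[0][0] = 1
print(b)                          # [[1, 0], [0, 0], [0, 0], [0, 0], [0, 0]]
```

This matters for a table like check, where each row is supposed to hold the results for a different entry.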

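With check = [[0]*len(database)]*len(STRs), the first index runs over len(STRs), and the second index, one level deeper, runs over len(database). Your loops use them the other way round, so the initialization and the loops have to be made to agree. One way is to build check with one row per database entry (using a list comprehension, so that every row is its own list) and keep the loop as it is:

check = [[False] * len(STRs) for _ in range(len(database))]
match = None
for i in range(len(database)):
    for j in range(len(STRs)):
        if matches[STRs[j]] == int(database[i][STRs[j]]):
            check[i][j] = True
        else:
            check[i][j] = False
    if False not in check[i]:
        match = i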
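It may also be worth noting that the check table is not strictly needed for this comparison. Below is a sketch of an alternative that compares each database row directly with all(). The function name find_match and the sample rows are invented for illustration, and unlike the loop in the question (which keeps the last matching index) it returns the first matching person.

```python
def find_match(database, STRs, matches):
    """Return the name of the first person whose STR counts all equal matches, else None."""
    for person in database:
        # csv.DictReader yields strings, so convert each count before comparing
        if all(int(person[s]) == matches[s] for s in STRs):
            return person["name"]
    return None


# Made-up rows in the shape csv.DictReader produces (all values are strings)
database = [
    {"name": "Alice", "AGATC": "2", "AATG": "8"},
    {"name": "Bob", "AGATC": "4", "AATG": "1"},
]
STRs = ["AGATC", "AATG"]

print(find_match(database, STRs, {"AGATC": 4, "AATG": 1}))  # Bob
print(find_match(database, STRs, {"AGATC": 9, "AATG": 9}))  # None
```

Since no two-dimensional list is built, there are no index dimensions to keep in sync.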

2 answers

Answer 1
2 2022-06-29 09:35:27

Answer 2
1 Accepted 2022-06-29 09:42:27
