CS50 Problem Set 6, IndexError: list index out of range
I don't know what is wrong here, but I keep getting an error message whenever I try to use the large database. For example:
dna/ $ python dna.py databases/large.csv sequences/10.txt
Traceback (most recent call last):
  File "/workspaces/103840690/dna/dna.py", line 104, in <module>
    main()
  File "/workspaces/103840690/dna/dna.py", line 47, in main
    check[i][j] = False
IndexError: list index out of range
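I know this kind of error means I am trying to access an index that does not exist, but nothing I have tried seems to work. What is also strange is that I only get it with the large database.
The problem is probably in lines 40 - 49, where the comment "Check database for matching profiles" is; I have pasted the whole code for context.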
import csv
import sys


def main():

    # Check for command-line usage
    if len(sys.argv) != 3:
        print("Two command-line arguments needed. ")
        return 1

    # Read database file into a variable
    with open(sys.argv[1], "r") as csv_file:
        csv_database = csv.DictReader(csv_file)
        # create a list where we can put dictionaries
        database = []
        for lines in csv_database:
            database.append(lines)

    # create a keys list where we can put STRs
    STRs = []
    for key in database[0].keys():
        STRs.append(key)
    STRs.remove("name")

    # Read DNA sequence file into a variable
    with open(sys.argv[2], "r") as txt_file:
        sequence = txt_file.read()

    # Find longest match of each STR in DNA sequence
    matches = {}
    for i in range(len(STRs)):
        matches[STRs[i]] = longest_match(sequence, STRs[i])

    # Check database for matching profiles
    check = [[0]*len(database)]*len(STRs)
    match = None
    for i in range(len(database)):
        for j in range(len(STRs)):
            if matches[STRs[j]] == int(database[i][STRs[j]]):
                check[i][j] = True
            else:
                check[i][j] = False
        if False not in check[i]:
            match = i
    if match != None:
        print(database[match]["name"])
    else:
        print("No match")

    return


def longest_match(sequence, subsequence):
    """Returns length of longest run of subsequence in sequence."""

    # Initialize variables
    longest_run = 0
    subsequence_length = len(subsequence)
    sequence_length = len(sequence)

    # Check each character in sequence for most consecutive runs of subsequence
    for i in range(sequence_length):

        # Initialize count of consecutive runs
        count = 0

        # Check for a subsequence match in a "substring" (a subset of characters) within sequence
        # If a match, move substring to next potential match in sequence
        # Continue moving substring and checking for matches until out of consecutive matches
        while True:

            # Adjust substring start and end
            start = i + count * subsequence_length
            end = start + subsequence_length

            # If there is a match in the substring
            if sequence[start:end] == subsequence:
                count += 1

            # If there is no match in the substring
            else:
                break

        # Update most consecutive matches found
        longest_run = max(longest_run, count)

    # After checking for runs at each character in sequence, return longest run found
    return longest_run


main()
Your indexes are in the wrong order. check is a list of len(STRs) elements, and each of those is a list of len(database) elements.
# Check database for matching profiles
check = [[0]*len(database)]*len(STRs)
match = None
for i in range(len(database)):
    for j in range(len(STRs)):
        if matches[STRs[j]] == int(database[i][STRs[j]]):
            check[i][j] = True
        else:
            check[i][j] = False
    if False not in check[i]:
        match = i
You iterate over the database with the variable i and over the STRs with the variable j. To match the way check is initialized, the result should be stored in check[j][i].
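To make the mismatch concrete, here is a minimal sketch of why a square case works while a case with more people than STRs fails, followed by one possible layout that keeps check[i][j] valid. The sizes (3 by 3, and 23 people by 8 STRs) are assumed for illustration, and `build_check`, `find_match`, and the sample data are hypothetical names that do not appear in the original program.

```python
def build_check(n_people, n_strs):
    """Reproduce the original initialization: n_strs rows of n_people columns."""
    return [[0] * n_people] * n_strs


# Illustrative sizes (assumed): a square case and one with more people than STRs
small = build_check(3, 3)
large = build_check(23, 8)

print(len(small), len(small[0]))  # 3 3: check[i][j] happens to stay in range
print(len(large), len(large[0]))  # 8 23: check[i] fails once i reaches 8

try:
    large[8][0] = False  # 8 is a valid database index but not a valid row
except IndexError as err:
    print("IndexError:", err)


def find_match(database, STRs, matches):
    """Return the index of the first row whose STR counts all match, else None."""
    # One independent row per person, one column per STR, so check[i][j] fits
    check = [[False] * len(STRs) for _ in range(len(database))]
    for i in range(len(database)):
        for j in range(len(STRs)):
            check[i][j] = matches[STRs[j]] == int(database[i][STRs[j]])
        if all(check[i]):
            return i
    return None


# Hypothetical sample data in the shape csv.DictReader would produce
database = [
    {"name": "Alice", "AGATC": "2", "AATG": "8"},
    {"name": "Bob", "AGATC": "4", "AATG": "1"},
    {"name": "Charlie", "AGATC": "3", "AATG": "2"},
]
STRs = ["AGATC", "AATG"]
matches = {"AGATC": 4, "AATG": 1}

index = find_match(database, STRs, matches)
print(database[index]["name"] if index is not None else "No match")  # Bob
```

Unlike the loop in the question, which keeps the last matching index, this sketch returns at the first match; either choice gives the same result when at most one profile matches.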
When you multiply a list, the list as a whole is repeated; its elements are not multiplied. See this example.
a = [[0]*2]*5
print(a)
> [[0, 0], [0, 0], [0, 0], [0, 0], [0, 0]]
print(a[4][1])
> 0
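As a side note on the same construct, repeating the outer list copies references to a single inner list, so writing to one row changes every row. The short sketch below illustrates this and shows a list comprehension as one common way to get independent rows; the names `shared` and `independent` are made up for the example.

```python
# Repeating the outer list copies references, so all rows are the same object
shared = [[0] * 2] * 5
shared[0][0] = 1
print(shared)  # [[1, 0], [1, 0], [1, 0], [1, 0], [1, 0]]
print(shared[0] is shared[4])  # True

# A list comprehension builds a new inner list for each row
independent = [[0] * 2 for _ in range(5)]
independent[0][0] = 1
print(independent)  # [[1, 0], [0, 0], [0, 0], [0, 0], [0, 0]]
print(independent[0] is independent[4])  # False
```

This matters for a grid such as check, where each cell is meant to hold its own result.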
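When you use check = [[0]*len(database)]*len(STRs), the first index of the list is bounded by len(STRs), and the second index, one level deeper, is bounded by len(database). You need to modify your code accordingly.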
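# Build independent rows; [[0]*n]*m would repeat one shared inner list
check = [[False] * len(database) for _ in range(len(STRs))]
for i in range(len(STRs)):
    for j in range(len(database)):
        # i indexes STRs and j indexes database, matching check[i][j]
        check[i][j] = matches[STRs[i]] == int(database[j][STRs[i]])
# Person j matches only if every STR row is True in column j
for j in range(len(database)):
    if all(check[i][j] for i in range(len(STRs))):
        match = j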
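As an alternative sketch, the check grid can be dropped entirely, which removes the question of index order. This assumes the same data shapes as the program above (a list of csv.DictReader rows and a dict of STR counts); `find_profile` and the sample rows are hypothetical and only illustrate the idea.

```python
def find_profile(database, matches):
    """Return the name of the first person whose STR counts all match, else None."""
    for person in database:
        # Compare every STR count; the "name" column is never looked up
        if all(int(person[str_name]) == count for str_name, count in matches.items()):
            return person["name"]
    return None


# Hypothetical sample data in the shape csv.DictReader would produce
database = [
    {"name": "Alice", "AGATC": "2", "AATG": "8"},
    {"name": "Bob", "AGATC": "4", "AATG": "1"},
]
matches = {"AGATC": 4, "AATG": 1}

name = find_profile(database, matches)
print(name if name is not None else "No match")  # Bob
```

Because no two-dimensional list is involved, this version behaves the same whether the database has fewer or more rows than there are STRs.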