[英]Attempting to add word frequency values to a python dictionary that already has keys but no values
any input would be much appreciated. 任何输入将不胜感激。 I've been stuck on this for a long while and a little desperate. 我已经坚持了很长一段时间,有点绝望。 Below is a portion of my code that is supposed to read a text file and print a keyword and the number of occurrences of that keyword. 下面是我的代码的一部分,应该读取一个文本文件并打印一个关键字以及该关键字的出现次数。 I created a dictionary with a key but no values. 我创建了一个有键但没有值的字典。 When I attempt to add values I receive this error message: "TypeError: list indices must be integers or slices, not dict". 当我尝试添加值时,我收到以下错误消息:“ TypeError:列表索引必须是整数或切片,而不是dict”。 The error is produced for this line of code: 该行代码产生错误:
position_term = {uat_dic[key_word], word_position}'.
I can answer any questions or provided additional info as needed. 我可以根据需要回答任何问题或提供其他信息。 Thanks so much for any help given. 非常感谢您提供的任何帮助。
import string
def main():
# Call a function to create the keyword frequency dictionary
create_keyword_frequency_dictionary()
# Call a function to calculate the keyword properties: frequency and position
keyword_frequency, keyword_position = calculate_keyword_properties()
# Call a function to display the results as in the Sample Output
display_results(keyword_frequency, keyword_position)
def create_keyword_frequency_dictionary():
# Open the uat_voc.txt file and create a dictionary where
# the key is the term found in the file, and the value is initialized to 0.
# This function only initializes the dictionary.
# Values for the keyword frequencies are set by the calculate_keyword_properties function.
#easy to use variable name that holds the file path, can easily change the file location here
file_name = "/Users/ccs/Library/Mobile Documents/com~apple~TextEdit/Documents/uat_voc.txt"
#create an empty dictionary
keyword_frequency_dictionary = []
#wrap the i/o in a try/catch statement
try:
#open the file and load it into infile
infile = open(file_name, 'r')
#for each line in infile
for line in infile:
#take the line, slice off the \n character
line = line[:-1]
#create a dictionary term where the key is the word in line, initialize value to 0
dic_term = {line: 0}
#add the dictionary term to the dictionary list
keyword_frequency_dictionary.append(dic_term)
#close the infile
infile.close()
# Catch IO Errors, with the File Not Found error the primary possible problem to detect.
except FileNotFoundError:
print("File not found when attempting to read", file_name)
return None
except IOError:
print("Error in data file when reading", file_name)
infile = None
infile.close()
return None
return keyword_frequency_dictionary
def calculate_keyword_properties():
# Open the HowBigDataIsChangingAstronomy.txt' file
# Read each line in the file and normalize the text as in Assignment 3
# For each word, determine if it is in the keyword_frequency dictionary, i.e, it is a keyword in the UAT vocabulary,
# If the word is a UAT keyword, then increment the frequency.
# For each word, if it is the first occurrence in the file, then save its position in a keyword_position dictionary
# Open the HowBigDataIsChangingAstronomy.txt' file
infile = open('/Users/ccs/Library/Mobile Documents/com~apple~TextEdit/Documents/HowBigDataIsChangingAstronomy.txt','r')
#read the entire file into clean_file and normalize (this file still has '' instead of ---) length = 2982
clean_file = remove_punctuation(infile.read())
#create empty list that will fill with words
clean_list_of_words = []
#create the keyword dictionary
uat_dic = create_keyword_frequency_dictionary()
#create the keyword_position dictionary
keyword_position_dictionary = []
print(clean_file)
print(len(clean_file))
#reparse the file and remove all '' from normalized file (length = 2972)
for line in clean_file:
for word in line.split():
clean_list_of_words.append(word)
print(len(clean_list_of_words))
#iterate through the clean list of words one word at a time to compare; store the word in word_position
for word_position in clean_list_of_words:
#iterate through the dictionary so you can compare the word to each entry
for key_word in iter(uat_dic):
#IDK what was happening here, i think this is wrong
list_word = clean_list_of_words.index(word_position)
dic_word = uat_dic.index(key_word)
#compare the word from word_position to each key value in the uat_dictionary
if list_word == dic_word:
#if it was a match, get the value, store it in frequency
frequency = uat_dic.index(key_word)
#increment frequency
frequency+=1
#check to see if this was the first occurance of the term
if frequency == 1:
#if this was the first occurrence, then store a dictionary key/value pair with word_position as the place in the document
#uat_dic = {uat_dic[key_word], word_position}
position_term = {uat_dic[key_word], word_position}
#add the dictionary pair to the key word position list
keyword_position_dictionary.append(position_term)
#update the frequency value in the uat dictionary
uat_dic[key_word] = frequency
#keeps looping through for every word in the text document
#return the uat frequency dictionary and the keyword position dictionary
return uat_dic, keyword_position_dictionary
# For each word, determine if it is in the keyword_frequency dictionary, i.e, it is a keyword in the UAT vocabulary,
# If the word is a UAT keyword, then increment the frequency.
# For each word, if it is the first occurrence in the file, then save its position in a keyword_position dictionary
Sometimes you just have to refactor: 有时您只需要重构:
def count_word(file_name, word):
return file(file_name).read().count(word)
This will count the number of occurrences of word
in the file, which is what it looks like you're trying to do. 这将计算文件中word
出现的次数,这就是您要尝试执行的操作。
Since you're asking to count a list of keywords, something like this should do it: 由于您要计算关键字列表,因此应执行以下操作:
import collections # this is from the standard library
def count_word(file_name, words):
'''
Takes a file name and a list of words to count.
'''
# initialize a counter
cnt = collections.Counter()
# tokenize the file
tokens = file(file_name).read().strip().split()
for token in tokens:
if token in words:
# increment the counter for that token
cnt[token] += 1
return cnt
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.