Python: Saved pickled Counter has data, but cannot load the file with a function
I'm trying to build a foreign language frequency dictionary/vocab learner.
I want the program to:
1. Process a book and count its words (using Counter())
2. Save the Counter() to a pickle file so that I don't have to process the book every time I run the program
3. Pull the Nth most common word out of the saved Counter (easily done with the most_common() function)
Here is the problem: once I process a book and save it to a pickle file, I cannot access it again. The function that is supposed to load it gives me an empty Counter, even though, when I check the pickle file, I can see that it does have data.
Furthermore, if I load the pickle file manually (using pickle.load()) and pull the Nth most common word manually (calling most_common() directly instead of through a custom function that loads the pickle and pulls the Nth most common word), it works perfectly.
I suspect there is something wrong with the custom function that loads pickle files, but I can't figure out what it is.
Here is the code:
import string
import collections
import pickle

freq_dict = collections.Counter()
dfn_dict = dict()

def save_dict(name, filename):
    pickle.dump(name, open('{0}.p'.format(filename), 'wb'))

#Might be a problem with this
def load_dict(name, filename):
    name = pickle.load(open('{0}.p'.format(filename), 'rb'))

def cleanedup(fh):
    for line in fh:
        word = ''
        for character in line:
            if character in string.ascii_letters:
                word += character
            else:
                yield word
                word = ''

#Opens a foreign language textfile and adds all unique
#words in it, to a Counter, ordered by frequency
def process_book(textname):
    with open (textname) as doc:
        freq_dict.update(cleanedup(doc))
    save_dict(freq_dict, 'svd_f_dict')

#Shows the Nth most frequent word in the frequency dict
def show_Nth_word(N):
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common()[N]

#Shows the first N most frequent words in the freq. dictionary
def show_N_freq_words(N):
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common(N)

#Presents a word to the user, allows user to define it
#adds the word and its definition to another dictionary
#which is used to store only the word and its definition
def define_word(word):
    load_dict(freq_dict, 'svd_f_dict')
    load_dict(dfn_dict, 'svd_d_dict')
    if word in freq_dict:
        definition = (input('Please define ' + str(word) + ':'))
        dfn_dict[word] = definition
    else:
        return print('Word not in dictionary!')
    save_dict(dfn_dict, 'svd_d_dict')
And here is an attempt to pull the Nth most common word out using both methods (manual and function):
from dictionary import *
import pickle
#Manual, works
freq_dict = pickle.load(open('svd_f_dict.p', 'rb'))
print(freq_dict.most_common()[2])
#Using a function defined in the other file, doesn't work
word = show_Nth_word(2)
Thanks for your help!
Your load_dict function stores the result of unpickling in a local variable, name. Assigning to that local name only rebinds it inside the function; it does not modify the object that you passed as a parameter, so the module-level freq_dict stays empty.
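To make the difference concrete, here is a minimal sketch contrasting rebinding a parameter with mutating the object it refers to. The function names rebind and mutate and the sample word are hypothetical, chosen only for illustration; they are not part of the original code.

```python
import collections

def rebind(counter):
    # Assignment binds the local name to a brand-new object;
    # the caller's Counter is left untouched.
    counter = collections.Counter({'hola': 3})

def mutate(counter):
    # An in-place method changes the very object the caller holds.
    counter.update({'hola': 3})

freq = collections.Counter()

rebind(freq)
after_rebind = dict(freq)   # snapshot taken after rebinding

mutate(freq)
after_mutate = dict(freq)   # snapshot taken after in-place update

print(after_rebind, after_mutate)
```

The original load_dict behaves like rebind here, which is why the caller keeps seeing an empty Counter.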
Instead, you need to return the result of calling pickle.load() from your load_dict() function:
def load_dict(filename):
    return pickle.load(open('{0}.p'.format(filename), 'rb'))
And then assign it to your variable:
freq_dict = load_dict('svd_f_dict')
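Putting the pieces together, one possible round trip looks like the sketch below. It is not the only way to structure this: the with blocks (so files get closed), the temporary directory, the sample sentence, and the show_nth_word signature taking a filename are all assumptions added for illustration. Note that the loaded Counter is assigned to a name inside the function; assigning to a module-level freq_dict from within a function would additionally need a global statement.

```python
import collections
import os
import pickle
import tempfile

def save_dict(obj, filename):
    # 'with' closes the file even if pickling raises.
    with open('{0}.p'.format(filename), 'wb') as fh:
        pickle.dump(obj, fh)

def load_dict(filename):
    # Return the unpickled object instead of assigning to a parameter.
    with open('{0}.p'.format(filename), 'rb') as fh:
        return pickle.load(fh)

def show_nth_word(n, filename):
    # Bind the loaded Counter to a local name and use it directly.
    freq_dict = load_dict(filename)
    return freq_dict.most_common()[n]

# Hypothetical sample data written to a temporary location.
base = os.path.join(tempfile.mkdtemp(), 'svd_f_dict')
words = 'el gato y el perro y el pez'.split()
save_dict(collections.Counter(words), base)

result = show_nth_word(0, base)
print(result)  # ('el', 3)
```

Only unpickle files you created yourself, since pickle.load() can execute arbitrary code from untrusted data.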