
Python: Saved pickled Counter has data, but cannot load the file with a function

I'm trying to build a foreign language frequency dictionary/vocab learner.

I want the program to:

  1. Process a book/text file, breaking the text up into individual unique words and ordering them by frequency (I do this using Counter()).
  2. Save the Counter() to a pickle file so that I don't have to process the book every time I run the program.
  3. Access the pickle file and pull out the Nth most frequent word (easily done using the most_common() method).

Here is the problem: once I process a book and save it to a pickle file, I cannot access it again. The function that does so loads an empty dictionary even though, when I check the pickle file, I can see that it does have data.

Furthermore, if I load the pickle file manually (using pickle.load()) and pull the Nth most common word manually (calling most_common() directly instead of going through a custom function that loads the pickle and pulls the Nth most common word), it works perfectly.

I suspect there is something wrong with the custom function that loads pickle files, but I can't figure out what it is.

Here is the code:

import string
import collections
import pickle

freq_dict = collections.Counter()
dfn_dict = dict()

def save_dict(name, filename):
    pickle.dump(name, open('{0}.p'.format(filename), 'wb'))

#Might be a problem with this
def load_dict(name, filename):
    name = pickle.load(open('{0}.p'.format(filename), 'rb'))

def cleanedup(fh):
    for line in fh:
        word = ''
        for character in line:
            if character in string.ascii_letters:
                word += character
            else:
                yield word
                word = ''

#Opens a foreign language textfile and adds all unique
#words in it, to a Counter, ordered by frequency
def process_book(textname):
    with open (textname) as doc:
        freq_dict.update(cleanedup(doc))
    save_dict(freq_dict, 'svd_f_dict')

#Shows the Nth most frequent word in the frequency dict
def show_Nth_word(N):
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common()[N]

#Shows the first N most frequent words in the freq. dictionary    
def show_N_freq_words(N):    
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common(N)

#Presents a word to the user, allows user to define it
#adds the word and its definition to another dictionary
#which is used to store only the word and its definition
def define_word(word):
    load_dict(freq_dict, 'svd_f_dict')
    load_dict(dfn_dict, 'svd_d_dict')
    if word in freq_dict:
        definition = (input('Please define ' + str(word) + ':'))
        dfn_dict[word] = definition
    else:
        return print('Word not in dictionary!')
    save_dict(dfn_dict, 'svd_d_dict')

And here is an attempt to pull out the Nth most common word, using both methods (manual and function):

from dictionary import *
import pickle

#Manual, works
freq_dict = pickle.load(open('svd_f_dict.p', 'rb'))
print(freq_dict.most_common()[2])

#Using a function defined in the other file, doesn't work
word = show_Nth_word(2)

Thanks for your help!

Your load_dict function stores the result of unpickling in a local variable, name. The assignment only rebinds that local name; it does not modify the object that you passed as a parameter to the function.
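
To illustrate the difference, here is a minimal sketch that contrasts rebinding a parameter with mutating the object it refers to. The helper names rebind and mutate and the sample word 'casa' are made up for this illustration and are not part of the original code.

```python
import collections

def rebind(name):
    # Assignment binds the local name to a brand-new object;
    # the caller's Counter is left untouched.
    name = collections.Counter({'casa': 3})

def mutate(name):
    # An in-place update changes the very object the caller holds.
    name.update({'casa': 3})

freq_dict = collections.Counter()

rebind(freq_dict)
after_rebind = dict(freq_dict)   # snapshot taken after the rebinding call

mutate(freq_dict)
after_mutate = dict(freq_dict)   # snapshot taken after the in-place update

print(after_rebind, after_mutate)
```

The load_dict in the question behaves like rebind here, so the module-level freq_dict stays empty and most_common() has nothing to return.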

Instead, you need to return the result of calling pickle.load() from your load_dict() function:

def load_dict(filename):
    return pickle.load(open('{0}.p'.format(filename), 'rb'))

And then assign it to your variable:

freq_dict = load_dict('svd_f_dict')
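
Putting the pieces together, the following is one possible sketch of the save/load round trip, not the only way to structure it. It assumes a few things that are not in the original post: the files are opened with with blocks so they are closed promptly, show_Nth_word takes an optional filename parameter (hypothetical, added so the example can run in a temporary directory), and the sample sentence stands in for a processed book.

```python
import collections
import os
import pickle
import tempfile

def save_dict(obj, filename):
    # 'with' closes the file even if pickling raises.
    with open('{0}.p'.format(filename), 'wb') as fh:
        pickle.dump(obj, fh)

def load_dict(filename):
    # Return the unpickled object instead of assigning it to a parameter.
    with open('{0}.p'.format(filename), 'rb') as fh:
        return pickle.load(fh)

def show_Nth_word(N, filename='svd_f_dict'):
    # 'filename' is a hypothetical extra parameter for this sketch.
    freq_dict = load_dict(filename)
    return freq_dict.most_common()[N]

# Stand-in for process_book(): count words from a sample sentence (assumed data).
workdir = tempfile.mkdtemp()
base = os.path.join(workdir, 'svd_f_dict')
save_dict(collections.Counter('el gato y el perro y el pez'.split()), base)

result = show_Nth_word(1, base)
print(result)
```

Because show_Nth_word assigns the loaded Counter to a name inside the function, that name is local to the function, which is sufficient here since the value is used immediately. To update a module-level freq_dict from inside a function, you would have to declare it global or return the value to the caller.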

