
Applying the strategy design pattern

I need to write a program for text lemmatization (reducing different forms of words to a base form). Since I'm going to be using different lemmatization libraries and comparing them, I've decided to use the Strategy pattern.

My idea is to wrap everything in a single class and, depending on the lemmatization function passed in, only change my lemmatize method.

Here's my class:

import re
import types

create_bound_method = types.MethodType


class Lemmatizator(object):
    def __init__(self, filename=None, lemmatization=None):
        if lemmatization and filename:
            self.filename = filename
            # Strategy pattern: bind the injected function to this instance as its lemmatize method
            self.lemmatize = create_bound_method(lemmatization, self)

    def _get_text(self):
        with open(f'texts/{self.filename}.txt', 'r') as file:
            self.text = file.read()

    def _split_to_unique(self):
        # Strip punctuation, split on whitespace and keep only the unique words
        text = re.sub(r'[^\w\s]', '', self.text)
        split_text = re.split(r'\s', text)

        self.unique_words = set(split_text)

        return self.unique_words

    def lemmatize(self):
        # Fallback when no strategy was bound in __init__
        return 'Lemmatize function or text are not found'

Then I'm creating my lemmatize method:

from nltk.stem import WordNetLemmatizer

wnl = WordNetLemmatizer()


def nltk_lemmatization(self):
    words = {}

    for word in self.unique_words:
        if word:
            # Lemmatize each unique word as a noun (default), adverb, adjective and verb
            words[word] = {
                'noun': wnl.lemmatize(word),
                'adverb': wnl.lemmatize(word, pos='r'),
                'adjective': wnl.lemmatize(word, pos='a'),
                'verb': wnl.lemmatize(word, pos='v')
            }

    return words

And trying to apply it:

nltk_lem = Lemmatizator('A Christmas Carol in Prose', nltk_lemmatization)
nltk_lem.lemmatize()

But I receive the following error:

    for word in self.unique_words:
AttributeError: 'Lemmatizator' object has no attribute 'unique_words'

What's wrong?

From what I can see, self.unique_words is only added to the instance in the _split_to_unique(self) function. So when you're calling nltk_lemmatization(self), _split_to_unique(self) hasn't been called yet, and as a result it tries to iterate through something that doesn't exist.
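One way to make that concrete (a minimal sketch, not part of the original answer; it reuses the Lemmatizator class and nltk_lemmatization function from the question and assumes the text file and NLTK's wordnet data are available) is to populate self.text and self.unique_words before calling the bound strategy:

nltk_lem = Lemmatizator('A Christmas Carol in Prose', nltk_lemmatization)
nltk_lem._get_text()           # reads texts/A Christmas Carol in Prose.txt into self.text
nltk_lem._split_to_unique()    # populates self.unique_words
lemmas = nltk_lem.lemmatize()  # the bound nltk_lemmatization now has words to iterate over

Alternatively, those two preparation calls could be made at the end of __init__ or at the top of the lemmatization strategy itself, so the caller never has to remember the right order.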
