简体   繁体   中英

Reading some files and passing to a constructor by a setter

I have a class that needs a list of words to be excluded from a string:

class Cleaner():
     def __init__(self, remove_words=None):
         self.remove_words = remove_words

     def clean(self, line):
         return u' '.join[word for word in line not in self.remove_words]

and on the main file I need to read the words to be removed from the lines:

if __name__ == "__main__":

     with open('remove_words') as r:
         words = r.read().splitlines()

     cleaning = Cleaner(words)

     with open('mylines') as f:
          lines = f.read()
          for line in lines:
               print cleaning.clean(line)

So I need to open the remove_words file before creating the Clean class. But alas, I need to open several files with words to be removed and the code get messy fast. So I added a class to set the removable words on the Clean class:

 class Cleaner():
     def __init__(self, remove_words=None):
         self.remove_words = remove_words

     def set_remove_words(self, words):
         self.remove_words = words  

     def clean(self, line):
         return u' '.join[word for word in line not in self.remove_words]

so now the main code would look like

if __name__ == "__main__":

     with open('remove_words') as r:
         words = r.read().splitlines()
     # after lots of these open files...
     with open('remove_more_words') as r:
         more_words = r.read().splitlines()

     cleaning = Cleaner()
     all_removable_words = words + more_worlds
     cleaning.set_remove_words(all_removable_words)

     with open('mylines') as f:
          lines = f.read()
          for line in lines:
               print cleaning.clean(line)

but then again things can get very messy. There are cases when I will have to open and pass just one list of removable words, sometimes will be several. What would be a "pythonic" solution for this? Would move the filenames with removable words be passed to the constructor and build the lists there be more "pythonic" and less error prone? Where should the exceptions be handled?

First off, good job on keeping your file I/O out of your class. I like how you adhere to Uncle Bob's clean architecture principle. You should absolutely not move it into the constructor, since that would couple your domain-rules code to the open function, and therefore make it less reusable.

I would make use of list comprehensions and generators to be Pythonic.

if __name__ == "__main__":

     bad_word_sources = ['remove_words',...,'remove_more_words']
     bad_word_files = (open(source) for source in bad_word_sources)       
     bad_words = [word for word in chain(bad_word_files)]

     cleaning = Cleaner(bad_words)

This works because the open() function provides an __iter__ implementation that will work similar to

[line for line in file.readlines()]

When the open object is exhausted it will take care of closing itself[reference needed].

I'm not sure which kind of exceptions you would like to handle, could you be more specific on this?

Also note that set_words methods are considered unpythonic . Just set the attribute directly if you have to.


On a side note this class only has 2 methods, one of which is __init__ :

 class Cleaner():
     def __init__(self, remove_words=None):
         self.remove_words = remove_words

     def clean(self, line):
         return u' '.join[word for word in line not in self.remove_words]

The Pythonic way of making this reusable is to ditch the class and put it in a module:

cleaner.py

def clean(line, bad_words):
    return u' '.join(word for word in line if line not in self.bad_words)

You can then use it like:

from cleaner import clean

instead of:

from cleaner import Cleaner
mycleaner = Cleaner(bad_words)
mycleaner.clean(line)

which is really confusing.

There are cases when I will have to open and pass just one list of removable words, sometimes will be several. What would be a "pythonic" solution for this?

I think what you need is dynamic parameters/arguments.

class Cleaner():
     def __init__(self, *remove_words):
         self.remove_words = []
         [self.remove_words.extend(one) for one in remove_words]

     def clean(self, line):
         return u' '.join[word for word in line not in self.remove_words]

and there is another way, **argw, for a keyword arguments which you can specify a name to the argument dict.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM