Class and instance organisation in Python 3

Question

I have a class that stores a dataframe (df) and contains methods allowing said df to be filtered:

class File(object):

    def __init__(self, f):
        self.f = f

    def view_ref(self):
        return self.f['REF']

    def filter_ref(self, val):
        ''' Filter REF column for the given val
        '''
        f = self.f[self.f['REF'] == val]
        self.f = f
        return self.f

The problem with this approach is that I want to access both the original, pre-filtered df and the filtered df after calling filter_ref(). That can't be done with the above code, because filtering overwrites self.f. I altered the class as below so the original df can be accessed at any time:

class File(object):

    def __init__(self, f):
        self.f = f
        self.filtered = None

    def view_ref(self):
        return self.f['REF']

    def filter_ref(self, val):
        ''' Filter REF column for the given val
        '''
        filtered = self.f[self.f['REF'] == val]
        self.filtered = filtered
        return self.filtered

The problem with the above approach is that I will eventually have various methods that filter and select data, and I would like to keep them separate. So I tried creating a separate class for each of these purposes:

class File(object):

    def __init__(self, f):
        self.f = f
        self.filter = FilterFile(f)

    def view_ref(self):
        return self.f['REF']

class FilterFile(object):

    def __init__(self, f):
        self.f = f

    def filtered_ref(self, val):
        f = self.f[self.f['REF'] == val]
        self.f = f
        return self.f

In the above example I can access both the filtered df and the original df through the File class, and I have kept the methods for filtering and selecting data separate. The problem now is that I can't use File methods such as view_ref() on the self.filter instance.

I am having a hard time determining how to best organise this code. Can someone help point me in the most pythonic direction to organise this?

Answer 1

There's no single solution for all situations you could possibly have.

You can make File an immutable object: it only gives access to your data, and any function that modifies the data returns a new object holding the new data. This is useful when the filtered and original data have the same properties, or at least when we know which class to produce after filtering. Good examples of this strategy are numbers (int, float, Decimal) and string-like objects (str, bytes).
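To make the comparison with built-in types concrete, here is a minimal sketch using only the standard library. The variable names and values are invented for illustration.

```python
from decimal import Decimal

original = "ref_a"
upper = original.upper()  # returns a new str; `original` is not modified

price = Decimal("1.10")
total = price + Decimal("2.20")  # arithmetic also yields a new Decimal

print(original, upper)  # ref_a REF_A
print(price, total)     # 1.10 3.30
```

Every operation hands back a fresh object, so the starting value is always still there to use.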

A sample class for this solution is below. I take filter_function as an argument to give more runtime flexibility:

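class File:
    def __init__(self, df, filter_function=None):
        self.df = df
        # Default filter, used when filter() is not given one explicitly.
        self.filter_function = filter_function

    def view_field(self, field):
        return self.df[field]

    def filter(self, *args, filter_function=None):
        """Return the filtered data in a new File object."""
        # Prefer the function passed to this call, else the stored default.
        filter_function = filter_function or self.filter_function
        if filter_function is None:
            return self  # nothing to filter with
        filtered = filter_function(self.df, *args)  # self.df is left untouched
        return File(filtered, filter_function)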
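As a rough sketch of how this idea addresses the original question, here is a pandas-free stand-in. The rows are a plain list of dicts rather than a DataFrame, and RowFile, its rows attribute, the POS column and the sample data are all made up for illustration. With a real DataFrame the filter body would be the boolean-mask expression from the question.

```python
class RowFile:
    """Hypothetical stand-in for File; holds a list of dict rows, not a DataFrame."""

    def __init__(self, rows):
        self.rows = list(rows)  # copy, so changes to the caller's list don't leak in

    def view_ref(self):
        return [row["REF"] for row in self.rows]

    def filter_ref(self, val):
        """Return a new RowFile with only the rows whose REF equals val."""
        return RowFile(row for row in self.rows if row["REF"] == val)


original = RowFile([
    {"REF": "A", "POS": 1},
    {"REF": "C", "POS": 2},
    {"REF": "A", "POS": 3},
])
filtered = original.filter_ref("A")

print(original.view_ref())  # ['A', 'C', 'A']
print(filtered.view_ref())  # ['A', 'A']
```

Because filter_ref() returns an instance of the same class, the selecting methods work on the filtered result, filters can be chained, and the original object stays available unchanged.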
