Use a class in a loop (Python)

Question

I'm new to using classes in Python and could use some guidance on which resources to consult and how to use a class in a loop.

Sample data:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
# the upper bound is exclusive, so use 2 to get both 0 and 1 as labels
df['E'] = np.random.randint(0, 2, size=100)

Here's the code outside of a class:

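from sklearn.linear_model import LogisticRegression

styles = [1, 3, 7]

def train_model(X, y):
    # the 'l1' penalty needs a solver that supports it, e.g. liblinear
    clf = LogisticRegression(random_state=0, C=1, penalty='l1', solver='liblinear')
    clf.fit(X, y)
    return clf

for value in styles:
    mask = df['D'] == value  # rows belonging to this style
    X = df.loc[mask, ['A', 'B', 'C']]
    y = df.loc[mask, 'E']
    train_model(X, y)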

I need to translate this into a class, like so:

class BaseTrainer(object):
    """ Abstract class to define run order """

    def run(self):
        self.import_training_data()
        for value in [1, 3, 7]:
            self.extract_variables(value)
            self.train_model()
            # I think there's a better way to do this
            if value == 1:
                pickle_model(self.model, self.model_file)
            elif value == 3:
                pickle_model(self.model, self.model_file2)
            elif value == 7:
                pickle_model(self.model, self.model_file3)


class ModelTrainer(BaseTrainer):
    """ Class to train model for predicting Traits of Customers """

    def __init__(self):
        self.model_file = '/wayfair/mnt/crunch_buckets/central/data_science/customer_style/train_modern.pkl'
        self.model_file2 = '/wayfair/mnt/crunch_buckets/central/data_science/customer_style/train_traditional.pkl'
        self.model_file3 = '/wayfair/mnt/crunch_buckets/central/data_science/customer_style/train_rustic.pkl'

    def import_training_data(self):
        _execute_vertica_query('get_training_data')

        self.df = _read_data('training_data.csv')
        self.df.columns = ['CuID', 'StyID', 'StyName',
                           'Filter', 'PropItemsViewed', 'PropItemsOrdered', 'DaysSinceView']

    def extract_variables(self, value):
        # Take subset of columns for training purposes (remove CuID, Segment)
        mask = self.df['StyID'] == value
        self.X = self.df.loc[mask, ['PropItemsViewed', 'PropItemsOrdered',
                                    'DaysSinceView']]

        # labels as a 1-D array
        self.y = self.df.loc[mask, 'Filter'].values

    def train_model(self):
        # the 'l1' penalty needs a solver that supports it, e.g. liblinear
        self.model = LogisticRegression(C=1, penalty='l1', solver='liblinear')

        self.model.fit(self.X, self.y)

I think there must be a better way to structure it or run through the three different values in the styles list. But I don't even know what to search for to improve this. Any suggestions, pointers, etc. would be appreciated!

Answer 1

You could just put the files in a list and index into it inside your loop, like so:

files = [self.model_file, self.model_file2, self.model_file3]
values = [1, 3, 7]
for n in range(len(values)):
    self.extract_variables(values[n])
    self.train_model()
    pickle_model(self.model, files[n])

Does this answer the question?
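
As a side note, Python's built-in enumerate hands you the position and the item together, so you don't need range(len(...)). A minimal sketch, assuming short placeholder file names instead of the real paths:

```python
values = [1, 3, 7]
# placeholder names for illustration, not the real paths
files = ['train_modern.pkl', 'train_traditional.pkl', 'train_rustic.pkl']

assignments = {}
for n, value in enumerate(values):
    # n is the position in the list, value is the style ID at that position
    assignments[value] = files[n]

print(assignments)
```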

Answer 2

An elegant way to do it is to iterate through both lists at the same time using zip:

def run(self):
    self.import_training_data()
    for value, model_file in zip([1, 3, 7], [self.model_file, self.model_file2, self.model_file3]):
        self.extract_variables(value)
        self.train_model()

        pickle_model(self.model, model_file)
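
If zip is new to you, it may help to look at the pairs it produces on their own. This is only a stand-alone sketch; the short file names are placeholders rather than the real paths:

```python
style_ids = [1, 3, 7]
# placeholder names for illustration
model_files = ['train_modern.pkl', 'train_traditional.pkl', 'train_rustic.pkl']

# zip pairs items by position and stops at the end of the shorter list
pairs = list(zip(style_ids, model_files))

for value, model_file in pairs:
    print(value, '->', model_file)
```

One thing to keep in mind: if the two lists have different lengths, zip silently drops the extra items, so keep them the same length.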

As for the design, it could be improved.

For instance, define your model files as a list directly (this needs import os):

base_dir = '/wayfair/mnt/crunch_buckets/central/data_science/customer_style'
self.model_file_list = [os.path.join(base_dir, name)
                        for name in ['train_modern.pkl', 'train_traditional.pkl', 'train_rustic.pkl']]

Which gives:

def run(self):
    self.import_training_data()
    for value, model_file in zip([1, 3, 7], self.model_file_list):
        self.extract_variables(value)
        self.train_model()

        pickle_model(self.model, model_file)
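
Another option, which goes a bit beyond what's written above, is to keep each style ID next to its file name in a dict so the two can't drift out of sync. Here's a rough sketch of that idea. The Trainer class, the STYLE_FILES mapping and the train method are made up for illustration; train just returns a plain dict in place of a fitted model, and the files go to a temp directory instead of the real location:

```python
import os
import pickle
import tempfile

# Hypothetical mapping from style ID to pickle file name
STYLE_FILES = {
    1: 'train_modern.pkl',
    3: 'train_traditional.pkl',
    7: 'train_rustic.pkl',
}


class Trainer(object):
    """Trains and pickles one model per style (illustrative only)."""

    def __init__(self, output_dir, style_files=STYLE_FILES):
        self.output_dir = output_dir
        self.style_files = style_files

    def train(self, value):
        # Stand-in for extract_variables() + train_model();
        # returns a plain dict instead of a fitted model
        return {'style_id': value}

    def run(self):
        written = []
        for value, file_name in self.style_files.items():
            model = self.train(value)
            path = os.path.join(self.output_dir, file_name)
            with open(path, 'wb') as handle:
                pickle.dump(model, handle)
            written.append(path)
        return written


output_dir = tempfile.mkdtemp()
written = Trainer(output_dir).run()
```

With this layout, adding a fourth style is a one-line change to the dict, and run doesn't need to be touched.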

solution1
0 2016-08-16 18:23:48

solution2
0 ACCEPTED 2016-08-16 18:36:56
