
Does fitting a sklearn LinearRegression model multiple times add data points or just replace them?

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split  # the cross_validation module was removed in sklearn 0.20

df.dropna(inplace=True)  # drop NaNs before building X so X and y stay aligned
X = np.array(df.drop(columns=[label]))
X_lately = X[-forecast_out:]   # most recent rows, held back for forecasting
X = X[:-forecast_out]
y = np.array(df[label])[:-forecast_out]  # trim y to match X's length

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

linReg = LinearRegression()
linReg.fit(X_train, y_train)

I've been fitting my linear regression model over and over again with data from different spreadsheets, under the assumption that every time I fit the same model with a new spreadsheet, it adds data points and makes the model more robust.

Was this assumption correct? Or am I just wiping the model every time I fit it?

If so, is there a way for me to fit my model multiple times for this 'cumulative' type effect?

Linear regression is a batch (i.e., offline) training method: you can't incrementally add knowledge from new samples. Every call to fit re-estimates the model from scratch on whatever data you pass it, so sklearn is re-fitting the whole model each time. The only way to add data is to append the new samples to your original training X, y matrices and re-fit.
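A minimal sketch of that append-and-refit pattern (the array names and toy data here are illustrative, not from the question):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Data from the first "spreadsheet"
X1 = np.array([[1.0], [2.0], [3.0]])
y1 = np.array([2.0, 4.0, 6.0])

# Data from a second spreadsheet arrives later
X2 = np.array([[4.0], [5.0]])
y2 = np.array([8.0, 10.0])

# Fitting on X2 alone would discard everything learned from X1.
# Instead, stack the matrices and re-fit on the union.
X_all = np.vstack([X1, X2])
y_all = np.concatenate([y1, y2])

model = LinearRegression()
model.fit(X_all, y_all)  # trained on all five points; slope comes out to 2.0
```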

You're almost certainly wiping your model and starting from scratch each time. To get the cumulative effect you want, append the additional rows to the bottom of your data frame and re-fit using the combined data.
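In pandas terms, that means concatenating the frames before fitting. A sketch with made-up column names ("feature", "label" are assumptions, not from the question):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df_old = pd.DataFrame({"feature": [1.0, 2.0, 3.0], "label": [3.0, 5.0, 7.0]})
df_new = pd.DataFrame({"feature": [4.0, 5.0], "label": [9.0, 11.0]})

# Append the new spreadsheet's rows to the bottom of the old frame
df_all = pd.concat([df_old, df_new], ignore_index=True)

X = df_all[["feature"]].to_numpy()
y = df_all["label"].to_numpy()

model = LinearRegression().fit(X, y)  # re-fit once on the combined data
```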
