
Does fitting a sklearn LinearRegression model multiple times add data points or just replace them?

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split  # the cross_validation module was removed in sklearn 0.20

df.dropna(inplace=True)  # drop NaNs before building X so X and y stay aligned
X = np.array(df.drop(columns=[label]))
X_lately = X[-forecast_out:]   # most recent rows, held back for forecasting
X = X[:-forecast_out]
y = np.array(df[label])[:-forecast_out]  # trim y to match X's length

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

linReg = LinearRegression()
linReg.fit(X_train, y_train)

I've been fitting my linear regression model over and over again with data from different spreadsheets, under the assumption that every time I fit the same model with a new spreadsheet, it adds data points and makes the model more robust.

Was this assumption correct? Or am I just wiping the model every time I fit it?

If so, is there a way for me to fit my model multiple times for this 'cumulative' type effect?

Linear regression is a batch (i.e., offline) training method: you can't incrementally add knowledge from new samples. Every call to fit re-estimates the model from scratch on whatever data you pass it, so sklearn is re-fitting the whole model each time. The only way to add data is to append the new samples to your original training X, y matrices and re-fit.
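A minimal sketch of that append-and-refit pattern (the array names and toy data here are illustrative, not from the question):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Data from the first "spreadsheet"
X1 = np.array([[1.0], [2.0], [3.0]])
y1 = np.array([2.0, 4.0, 6.0])

# Data from a second spreadsheet arrives later
X2 = np.array([[4.0], [5.0]])
y2 = np.array([8.0, 10.0])

# Fitting on X2 alone would discard everything learned from X1.
# Instead, stack the matrices and re-fit on the union.
X_all = np.vstack([X1, X2])
y_all = np.concatenate([y1, y2])

model = LinearRegression()
model.fit(X_all, y_all)  # trained on all five points; slope comes out to 2.0
```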

You're almost certainly wiping your model and starting from scratch each time. To get the cumulative effect you want, append the additional rows to the bottom of your data frame and re-fit using the combined data.
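In pandas terms, that means concatenating the frames before fitting. A sketch with made-up column names ("feature", "label" are assumptions, not from the question):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df_old = pd.DataFrame({"feature": [1.0, 2.0, 3.0], "label": [3.0, 5.0, 7.0]})
df_new = pd.DataFrame({"feature": [4.0, 5.0], "label": [9.0, 11.0]})

# Append the new spreadsheet's rows to the bottom of the old frame
df_all = pd.concat([df_old, df_new], ignore_index=True)

X = df_all[["feature"]].to_numpy()
y = df_all["label"].to_numpy()

model = LinearRegression().fit(X, y)  # re-fit once on the combined data
```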
