
Train a logistic regression model in parts for big data

My data set consists of 1.6 million rows and 17000 columns after preprocessing. I want to use logistic regression on this data, however the process gets killed every time I load the dataset. Is there a way I can train a logistic regression model in chunks, with the coefficients being updated at each iteration? Does sklearn support any technique for my problem?

First, please read this. The time to train a logistic regression on a data set of that size is... a bit high. To avoid that, you can use the warm_start parameter of LogisticRegression in sklearn and loop over chunks of your data.

warm_start : bool, default: False
When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. Useless for liblinear solver. See the Glossary.

(from here )

And to be more precise:

warm_start
When fitting an estimator repeatedly on the same dataset, but for multiple parameter values (such as to find the value maximizing performance as in grid search), it may be possible to reuse aspects of the model learnt from the previous parameter value, saving time. When warm_start is true, the existing fitted model attributes are used to initialise the new model in a subsequent call to fit.

(from here )
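A minimal sketch of that loop, assuming the data sits in a CSV named train.csv with a binary column called label (both names are placeholders), read in chunks with pandas:

import pandas as pd
from sklearn.linear_model import LogisticRegression

# warm_start=True: each call to fit() starts from the previously learnt
# coefficients instead of re-initialising them. The solver must not be
# 'liblinear' (warm_start is ignored there). max_iter is kept small so a
# single chunk only nudges the solution rather than fully re-fitting to it.
clf = LogisticRegression(warm_start=True, solver="saga", max_iter=10)

# chunksize controls how many rows are held in memory at once.
for chunk in pd.read_csv("train.csv", chunksize=100_000):
    X = chunk.drop(columns=["label"])
    y = chunk["label"]
    clf.fit(X, y)  # previous coefficients are reused as the starting point

Keep in mind that warm_start only reuses the previous solution as initialisation; each call to fit still minimises the loss on the current chunk alone, which is why max_iter is kept small here, so that a later chunk does not simply overwrite what earlier chunks contributed.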
