Multiprocessing in Logistic Regression in Python

Question

I am using scikit-learn's LogisticRegression.

It works fine, except that it takes a long time to finish.

I decided to use the multiprocessing option (n_jobs=-1), as described at https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

However, there was no change in performance.

Here is my code:

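from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
mdl = LogisticRegression(n_jobs=-1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
mdl.fit(X_train, y_train)
y_pred = mdl.predict(X_test)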

How can I get multiprocessing to actually speed up LogisticRegression?

Answer 1

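Are you doing multiclass classification? According to the documentation, n_jobs only parallelizes over classes when the one-vs-rest scheme (multi_class='ovr') is used, and it is ignored by the 'liblinear' solver. If your data does not have more than two classes, setting the n_jobs argument is virtually useless.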

To improve speed, try feature engineering or feature selection to reduce the number of features.
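
As a rough illustration of reducing the feature count, here is a minimal sketch. It assumes univariate selection with SelectKBest, which is only one of several possible approaches and is not something the question mentions. The synthetic data from make_classification and the choice of k=20 are placeholders for your own X, y and feature budget.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the asker's X and y (assumption, not from the question).
X, y = make_classification(n_samples=2000, n_features=200,
                           n_informative=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Keep the 20 features with the highest ANOVA F-score, then fit the model
# on the reduced matrix. k=20 is an arbitrary illustrative value.
mdl = make_pipeline(SelectKBest(f_classif, k=20),
                    LogisticRegression(max_iter=1000))
mdl.fit(X_train, y_train)

# Number of features that survived the selection step.
n_kept = int(mdl.named_steps["selectkbest"].get_support().sum())
y_pred = mdl.predict(X_test)
print(n_kept, mdl.score(X_test, y_test))
```

Putting the selector inside a pipeline means the same features are selected automatically at prediction time.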

You could also try changing the solver. Here's what the documentation says:
"For small datasets, 'liblinear' (used to be the default) is a good choice, whereas 'sag' and 'saga' are faster for large ones. For multiclass problems, only 'newton-cg', 'sag', 'saga' and 'lbfgs' handle multinomial loss; 'liblinear' is limited to one-versus-rest schemes."
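
To see which solver is fastest on your data, you can simply time each one. The sketch below does that on synthetic data (an assumption, substitute your own X and y). It also standardizes the features first, because 'sag' and 'saga' are only expected to converge quickly when features are on a similar scale. The timings depend on the machine and the dataset, so treat them as a guide only.

```python
import time

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic binary problem standing in for the real data (assumption).
X, y = make_classification(n_samples=5000, n_features=50, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Map each solver name to (fit time in seconds, test accuracy).
results = {}
for solver in ["liblinear", "lbfgs", "sag", "saga"]:
    mdl = LogisticRegression(solver=solver, max_iter=1000, random_state=42)
    start = time.perf_counter()
    mdl.fit(X_train_s, y_train)
    elapsed = time.perf_counter() - start
    results[solver] = (elapsed, mdl.score(X_test_s, y_test))

for solver, (seconds, accuracy) in results.items():
    print(f"{solver:10s} fit: {seconds:.3f}s  accuracy: {accuracy:.3f}")
```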

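There are also some parameters, such as tol (the stopping tolerance), that you could try changing. A looser tolerance lets the solver stop earlier, at the cost of a less precise fit.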
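
A minimal sketch of the effect of tol, assuming the 'lbfgs' solver and synthetic data in place of your own X and y. The two tolerance values are arbitrary illustrative choices. The fitted n_iter_ attribute reports how many iterations the solver actually ran.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem standing in for the real data (assumption).
X, y = make_classification(n_samples=5000, n_features=50, random_state=42)

# Same model and data, differing only in the stopping tolerance.
strict = LogisticRegression(solver="lbfgs", tol=1e-6, max_iter=1000).fit(X, y)
loose = LogisticRegression(solver="lbfgs", tol=1e-1, max_iter=1000).fit(X, y)

# n_iter_ holds the number of iterations run; for a binary problem it
# contains a single entry.
strict_iters = int(strict.n_iter_[0])
loose_iters = int(loose.n_iter_[0])
print("tol=1e-6 iterations:", strict_iters, "accuracy:", strict.score(X, y))
print("tol=1e-1 iterations:", loose_iters, "accuracy:", loose.score(X, y))
```

If the accuracy is essentially unchanged with the looser tolerance, the extra iterations were not buying much.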

Finally, if none of this helps, consider using a different model.
