
Scaling issues with scipy.sparse matrix while using scikit

Original 2013-11-27 10:36:37 2 1 python/ machine-learning/ scikit-learn

While solving a machine learning problem with scikit-learn (Python), I need to scale a scipy.sparse matrix before training an SVM in order to achieve higher accuracy. But it is clearly stated here that:

scale and StandardScaler accept scipy.sparse matrices as input only when with_mean=False is explicitly passed to the constructor. Otherwise a ValueError will be raised as silently centering would break the sparsity and would often crash the execution by allocating excessive amounts of memory unintentionally.

This means that I cannot get zero mean this way. So how do I scale this sparse matrix so that it has zero mean as well as unit variance? I also need to store this 'scaling' so that I can apply the same transformation to the test matrix.
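The constraint quoted above can be reproduced with a short sketch. The matrices here are random placeholders (not the asker's data), and the variable names are made up for illustration. It shows that fitting a default StandardScaler on sparse input raises ValueError, while a scaler built with with_mean=False can be fitted on the training matrix and then reused on the test matrix. Note that this only gives unit variance, not zero mean.

```python
from scipy import sparse
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the real train/test matrices.
X_train = sparse.random(100, 20, density=0.1, format="csr", random_state=0)
X_test = sparse.random(30, 20, density=0.1, format="csr", random_state=1)

# With the default with_mean=True, sparse input is rejected.
try:
    StandardScaler().fit(X_train)
    centering_rejected = False
except ValueError:
    centering_rejected = True

# Variance-only scaling keeps the matrix sparse.
scaler = StandardScaler(with_mean=False)
X_train_scaled = scaler.fit_transform(X_train)

# The fitted scaler object stores the per-feature scale,
# so the same transformation can be applied to the test set.
X_test_scaled = scaler.transform(X_test)
```

The fitted scaler object is what "stores" the scaling; it could also be pickled alongside the trained model if the test data arrives later.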

1 answer

If the matrix is small, you can densify it with X.toarray(). If the matrix is large, this will probably exhaust your RAM.
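A minimal sketch of the densifying route, assuming the matrix really is small enough to fit in memory. The data is a random placeholder and the size estimate is just rows × columns × bytes per value; checking it before calling toarray() is a suggested precaution, not part of the original answer.

```python
from scipy import sparse
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the real train/test matrices.
X_train = sparse.random(50, 5, density=0.5, format="csr", random_state=0)
X_test = sparse.random(10, 5, density=0.5, format="csr", random_state=1)

# Rough memory the dense copy would need, in bytes.
n_rows, n_cols = X_train.shape
dense_bytes = n_rows * n_cols * X_train.dtype.itemsize

# Densify, then centre and scale as usual (with_mean=True is the default).
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.toarray())

# Reuse the statistics learned on the training set for the test set.
X_test_scaled = scaler.transform(X_test.toarray())
```

After this, each column of X_train_scaled has zero mean and unit variance; the test matrix is shifted and scaled with the training statistics, so its columns will generally not be exactly centred.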

As an alternative to mean-centering and scaling, you can try per-sample normalization with sklearn.preprocessing.Normalizer; this is appropriate for frequency features (e.g. in text classification).
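A small sketch of per-sample normalization on sparse input. The count matrix is a made-up example of term frequencies, and the L2 norm is chosen here for illustration. Normalizer rescales each row independently, so it has no statistics to learn and the same object can be applied to training and test data alike.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import Normalizer

# Made-up term counts: 3 documents, 3 terms.
counts = sparse.csr_matrix(np.array([[3.0, 0.0, 4.0],
                                     [0.0, 0.0, 2.0],
                                     [1.0, 2.0, 2.0]]))

# Scale every row to unit L2 norm; the result stays sparse.
normalizer = Normalizer(norm="l2")
counts_normalized = normalizer.fit_transform(counts)

# Row norms after normalization, computed without densifying.
row_norms = np.sqrt(
    np.asarray(counts_normalized.multiply(counts_normalized).sum(axis=1))
).ravel()
```

Because zeros stay zero under row-wise rescaling, this avoids the memory problem that centering causes.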

Tridiagonal block matrix using scipy.sparse

Creating a large sparse matrix in scipy.sparse

Apply a convolution to a scipy.sparse matrix

What is a scipy.sparse matrix in the CSR format?

sFrame into scipy.sparse csr_matrix

Vectorization of index operation for a scipy.sparse matrix

Iterating through a scipy.sparse vector (or matrix)

Scipy.sparse CSC-matrix performance

Determining the byte size of a scipy.sparse matrix?

Converting a scipy.sparse matrix into an equivalent MATLAB sparse matrix


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1 5 2013-11-27 11:33:31
