
MemoryError: Unable to allocate MiB for an array with shape and data type, when using anymodel.fit() in sklearn

I am getting this memory error, but the book/link I am following doesn't get it.

A part of the code:

from sklearn.linear_model import SGDClassifier

# x_train, y_train: the full MNIST training set (60,000 x 784) loaded earlier in the notebook
sgd_clf = SGDClassifier()
sgd_clf.fit(x_train, y_train)

Error: MemoryError: Unable to allocate 359. MiB for an array with shape (60000, 784) and data type float64

I also get this error when I try to scale the data using StandardScaler's fit_transform.

But both work fine if I decrease the size of the training set (e.g. x_train[:1000], y_train[:1000]), as in the sketch below.
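A minimal sketch of that workaround, assuming x_train and y_train are the MNIST arrays from the book's notebook:

from sklearn.linear_model import SGDClassifier

# Fit on the first 1,000 samples only; a 1000 x 784 float64 slice
# needs about 6 MiB instead of the full ~359 MiB.
sgd_clf = SGDClassifier()
sgd_clf.fit(x_train[:1000], y_train[:1000])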

Link to the code from the book is here. The errors I get are at lines 60 and 63 (cells In [60] and In [63]).

The book: Aurélien Géron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Ed. (page 149 / 1130)

So here's my question:

Does this have anything to do with my RAM? And what does "Unable to allocate 359. MiB" mean? Is that the memory size?

Just in case, my specs: CPU: Ryzen 2400G; RAM: 8 GB (3.1 GB free when running Jupyter Notebook).

Upgrading to 64-bit Python seems to have solved all the MemoryError problems.
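One quick way to confirm which interpreter you are running (a 32-bit Python process can only address roughly 2-4 GB, no matter how much RAM is installed) is to check the pointer size:

import struct
import sys

print(sys.version)               # interpreter version and build
print(struct.calcsize('P') * 8)  # prints 32 or 64: pointer size in bits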

The message is straightforward: yes, it has to do with the available memory.

60000 * 784 * 8 bytes = 376,320,000 bytes ≈ 359 MiB

where MiB = mebibyte = 2^20 bytes, 60000 x 784 are the dimensions of your array, and 8 bytes is the size of a float64.
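You can verify the figure yourself; a short sketch of the arithmetic:

rows, cols, itemsize = 60000, 784, 8   # array shape and bytes per float64
size_bytes = rows * cols * itemsize    # 376,320,000 bytes
print(size_bytes / 2**20)              # ~358.89, reported as "359. MiB"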

Maybe the 3.1 GB of free memory is badly fragmented and it is not possible to allocate 359 MiB in one contiguous piece?

A reboot may be helpful in that case.

Did you try converting to smaller data types, e.g. float64 to float32, or, if possible, np.uint8?

Pred['train'] = Pred['train'].astype(np.uint8, errors='ignore')  # pandas astype; 'ignore' keeps the data unchanged if the cast fails
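The same idea applied to the question's data (a hedged sketch; x_train is assumed to be the 60000 x 784 MNIST array, whose pixel values are integers from 0 to 255, so uint8 loses nothing):

import numpy as np

x8 = x_train.astype(np.uint8)     # ~45 MiB instead of ~359 MiB (8x smaller)
x32 = x_train.astype(np.float32)  # ~180 MiB; use this if the estimator needs floats
print(x_train.nbytes, x8.nbytes, x32.nbytes)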

It seems that this error has something to do with memory not being released after use. I use PyCharm with a few dozen Python apps running continuously. I have 40 GB of RAM and got this message for a data frame (39 x 2272) that had been processed once a minute for days or weeks with no problem. At the same time I got a memory error in two other Python apps (running within the same PyCharm instance): pandas.errors.ParserError: Error tokenizing data. C error: out of memory
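If lingering references in a long-running process are the suspect, a hedged sketch of freeing a large object between runs (df here is a hypothetical large DataFrame held by the app):

import gc

del df        # drop the last reference to the large object
gc.collect()  # run a full collection pass so the memory can be reused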

Just restart your PC after closing all tabs and software.
