Running out of RAM while running python script

I am running the following script in a Google Colab cell:

import gc

import numpy as np
import pyedflib

fs = 256  # samples per second

# because of the RAM restriction, I can only run about 20 datasets at a time
dataset = []
length = len(detail['dataset'])  # length is 123
for i in range(length):
    name = detail['dataset'][i]
    start = detail['seizure start'][i] * fs  # seizure start, in samples
    end = detail['seizure end'][i] * fs  # seizure end, in samples
    f = pyedflib.EdfReader(name)
    n = f.signals_in_file
    signal_labels = f.getSignalLabels()
    sigbufs = np.zeros((n, f.getNSamples()[0]))
    for j in np.arange(n):
        sigbufs[j, :] = f.readSignal(j)
    l = sigbufs.shape[-1]
    t = np.linspace(0, l / fs, l)
    f.close()
    # keep 100 extra samples on each side of the seizure
    start = start - 100
    end = end + 100
    dataset.append([t[start:end], sigbufs[:, start:end]])
    print("completed run " + str(i) + " out of " + str(length))
    del sigbufs
    del f
    gc.collect()

import pickle
with open("dataset.txt",'wb') as fp:
    pickle.dump(dataset, fp)

Intuitively, I expected each iteration of the loop to discard the previous values of sigbufs and f (the data and the object that reads the data, respectively). That was apparently not the case, because the Google Colab session ran out of RAM and crashed. I therefore added del sigbufs, del f and gc.collect() at the end of the loop, but that did not help either.

Is there a way to free the RAM so that the session does not crash? If I process the files manually in batches, say 20 datasets at a time, the RAM does not run out (it can handle that amount).

NOTE: The final data that I want to save is not that large; it is the original dataset that is relatively large.
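One possible reason the del calls have no effect, offered as a reading of the code above and not as a confirmed diagnosis: basic NumPy slicing such as sigbufs[:, start:end] returns a view, and a view keeps the whole underlying array alive for as long as the view itself is referenced (here, from the dataset list). The sketch below uses made-up array sizes and hypothetical names (big, window_view, window_copy) to show the difference between a view and an independent copy.

```python
import numpy as np

# Stand-in for one recording: 4 channels, one million samples each.
big = np.zeros((4, 1_000_000))

# Basic slicing returns a view that shares memory with `big`.
window_view = big[:, 100:200]

# .copy() allocates a new, small array that does not reference `big`.
window_copy = big[:, 100:200].copy()

view_shares_memory = np.shares_memory(window_view, big)
copy_shares_memory = np.shares_memory(window_copy, big)

print("view keeps the full array alive:", window_view.base is big)
print("copy is independent:", window_copy.base is None)
print("bytes held by the copy:", window_copy.nbytes)
```

If this is what happens in the loop, appending t[start:end].copy() and sigbufs[:, start:end].copy() would let each full array be released once the iteration ends.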

I have faced a similar problem in one of my machine learning training scripts that works with multiple datasets in a loop. In my experience, Python's garbage collector does not reliably return the memory in a situation like yours. I would suggest removing the loop from your current Python file (the main file) and running that file from a second Python file (a script file). The script file contains the loop and also merges the results. That way, each iteration of your current loop is performed as a separate run of the main file, so its memory is released when the run ends, and your problem should be eliminated.
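A minimal sketch of that idea, under stated assumptions: the worker here is a placeholder that only builds a small dictionary, standing in for the real per-file EDF processing, and the names WORKER_SOURCE, run_all, worker.py and part_*.pkl are hypothetical. Each item is processed in its own Python process, the result is written to a pickle file, and the driver merges the small results.

```python
import pickle
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical worker script: in the real case this would open one EDF file,
# cut out the seizure window and pickle only that small result.
WORKER_SOURCE = '''
import pickle
import sys

index, out_path = int(sys.argv[1]), sys.argv[2]
result = {"index": index, "values": list(range(index, index + 3))}
with open(out_path, "wb") as fp:
    pickle.dump(result, fp)
'''


def run_all(n_items, workdir):
    """Run the worker once per item in a fresh process and merge the results."""
    workdir = Path(workdir)
    worker = workdir / "worker.py"
    worker.write_text(WORKER_SOURCE)

    merged = []
    for i in range(n_items):
        out_path = workdir / f"part_{i}.pkl"
        # A new interpreter per item: all of its memory goes back to the
        # operating system when the process exits.
        subprocess.run(
            [sys.executable, str(worker), str(i), str(out_path)],
            check=True,
        )
        with open(out_path, "rb") as fp:
            merged.append(pickle.load(fp))
    return merged


with tempfile.TemporaryDirectory() as tmp:
    merged = run_all(4, tmp)

print("merged", len(merged), "results")
```

The driver only ever holds the small per-item results, so its own memory use stays low regardless of how large each input file is.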
