简体   繁体   中英

Memory leak using pandas dataframe

I am using pandas.DataFrame in a multi-threaded code (actually a custom subclass of DataFrame called Sound ). I have noticed that I have a memory leak, since the memory usage of my program augments gradually over 10mn, to finally reach ~100% of my computer memory and crash.

I used objgraph to try tracking this leak, and found out that the count of instances of MyDataFrame is going up all the time while it shouldn't : every thread in its run method creates an instance, makes some calculations, saves the result in a file and exits ... so no references should be kept.

Using objgraph I found that all the data frames in memory have a similar reference graph :

在此输入图像描述

I have no idea if that's normal or not ... it looks like this is what is keeping my objects in memory. Any idea, advice, insight ?

Confirmed that there's some kind of memory leak going on in the indexing infrastructure. It's not caused by the above reference graph. Let's move the discussion to GitHub (SO is for Q&A):

https://github.com/pydata/pandas/issues/2659

EDIT: this actually appears to not be a memory leak at all, but has to do with the OS memory allocation issues perhaps. Please have a look at the github issue for more information

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM