
Memory leak(?) Python 3.2

Hi, I am new to Python and have read many posts on this subject, but none has a specific answer. (I am using the 64-bit edition of Python 3.2.)

I have a large input file which I read inside a loop. As I read the file I create groups, which I append to a list. I process the list and then store the result in a file. I then drop the reference to the list (List = None) and delete it. I even call the garbage collector manually. The problem is that the memory is still being used: swap space and RAM usage go wild.

for line in file:  # read line by line
    temp_buffer = line.split()  # split the line into tokens
    for word in temp_buffer:
        if not l1:  # list is empty
            l1.append(str(word))  # store the '-' term in the list
        else:  # list is not empty
            tempp = l1.pop(0)
            l1.insert(0, "-0")
            l1.sort(key=int)
            l2 = term_compress(l1)

            l1 = None  # drop the reference
            del l1     # delete the name

            print(" ".join(str(i) for i in l2), file=testfile)  # write each term to the file
            l2 = None  # drop the reference
            del l2     # delete the name

            gc.collect()  # run the garbage collector (free RAM)
            l1 = []
            l2 = []
            l1.append(str(word))

What am I doing wrong?

Edit:

Example input:

-a 1 2 3 4 5 6 7 8 9 10

-n 7 8 9 10 11 12 13 14 15 ...

Output:

-a 1# 10#

-n 7# 15#

It's most likely not a memory or reference leak in the traditional sense of a programming error. What you're likely seeing is the underlying C runtime holding on to heap memory that Python allocated on your behalf during the loop, in anticipation that you might need it again. Holding on to it is cheaper than returning it to the OS kernel only to request it again and again.

So, in short, even though your objects have been garbage collected in the Python runtime, the underlying C runtime hangs on to the heap memory in case your program needs it again.

From the glibc documentation:

Occasionally, free can actually return memory to the operating system and make the process smaller. Usually, all it can do is allow a later call to malloc to reuse the space. In the meantime, the space remains in your program as part of a free-list used internally by malloc.
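One way to see the difference between "freed by Python" and "peak usage" is to compare the two numbers that the standard library's tracemalloc module reports. This is only a rough sketch: tracemalloc needs Python 3.4 or later (so it is not available on 3.2), and it tracks allocations made through Python's allocator, not the process size the OS reports. The list size used here is an arbitrary choice for illustration.

```python
import gc
import tracemalloc

tracemalloc.start()

# Build a large temporary list, then drop the only reference to it.
data = [str(i) for i in range(200000)]
del data
gc.collect()  # not strictly needed here; reference counting already freed the list

# get_traced_memory() returns (currently allocated bytes, peak allocated bytes).
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print("still allocated by Python: %d bytes" % current)
print("peak allocated by Python:  %d bytes" % peak)
```

The first number should fall back to a small value once the list is gone, while the peak stays high. The process size shown by the OS tends to follow the peak, for the reason given in the glibc quote.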

In that sense, your memory utilization reported by the OS after the loop is basically your "peak memory utilization". If you think it's too high, then you have to consider redesigning your program to limit its peak memory usage. This is typically accomplished using some kind of streaming or buffering design where you're operating on smaller chunks of the data at a time.
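A minimal sketch of such a streaming design, under some assumptions: every group starts with a token beginning with "-" (as in the example input), and the question's term_compress collapses runs of consecutive integers. The names iter_groups, compress_runs and process are hypothetical, and the output format is guessed from the example output, so treat this as an illustration and not as the asker's actual logic.

```python
import io
import sys


def iter_groups(lines):
    """Yield (term, numbers) one group at a time.

    Assumes a token starting with '-' opens a new group.
    """
    term, numbers = None, []
    for line in lines:
        for token in line.split():
            if token.startswith("-"):
                if term is not None:
                    yield term, numbers
                term, numbers = token, []
            elif term is not None:
                numbers.append(int(token))
    if term is not None:
        yield term, numbers


def compress_runs(numbers):
    """Hypothetical stand-in for term_compress: consecutive runs as [start, end]."""
    runs = []
    for n in sorted(numbers):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n  # extend the current run
        else:
            runs.append([n, n])  # start a new run
    return runs


def process(lines, out):
    """Write each group as soon as it is complete, so only one group is in memory."""
    for term, numbers in iter_groups(lines):
        parts = [term]
        for start, end in compress_runs(numbers):
            parts.append("%d#" % start)
            if end != start:
                parts.append("%d#" % end)
        print(" ".join(parts), file=out)


if __name__ == "__main__":
    sample = io.StringIO("-a 1 2 3 4 5 6 7 8 9 10\n-n 7 8 9 10 11 12 13 14 15\n")
    process(sample, sys.stdout)
```

Because iter_groups is a generator, peak memory is bounded by the largest single group instead of by the whole input. With a real file, the same process function can be called with the open input and output file objects.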

Disclaimer: the above is a layman's version and is obviously implementation-specific to the various flavors of Python, C, and the OS.
