When I use Python's multiprocessing.Array to create a 1 GB shared array, I find that the Python process uses around 30 GB of memory during the call to multiprocessing.Array and only releases it afterwards. I'd appreciate any help figuring out why this happens and how to work around it.
Here is code to reproduce it on Linux, with memory monitored by smem:
import multiprocessing
import ctypes
import numpy
import time
import subprocess
import sys

def get_smem(secs, by):
    # Print an smem snapshot every `by` seconds, `secs` times in total.
    for t in range(secs):
        print subprocess.check_output("smem")
        sys.stdout.flush()
        time.sleep(by)

def allocate_shared_array(n):
    data = multiprocessing.Array(ctypes.c_ubyte, range(n))
    print "finished allocating"
    sys.stdout.flush()

n = 10**9
secs = 30
by = 5
p1 = multiprocessing.Process(target=get_smem, args=(secs, by))
p2 = multiprocessing.Process(target=allocate_shared_array, args=(n,))
p1.start()
p2.start()
print "pid of allocation process is", p2.pid
p1.join()
p2.join()
p1.terminate()
p2.terminate()
Here is output:
pid of allocation process is 2285
PID User Command Swap USS PSS RSS
2116 ubuntu top 0 700 773 1044
1442 ubuntu -bash 0 2020 2020 2024
1751 ubuntu -bash 0 2492 2528 2700
2284 ubuntu python test.py 0 1080 4566 11924
2286 ubuntu /usr/bin/python /usr/bin/sm 0 4688 5573 7152
2276 ubuntu python test.py 0 4000 8163 16304
2285 ubuntu python test.py 0 137948 141431 148700
PID User Command Swap USS PSS RSS
2116 ubuntu top 0 700 773 1044
1442 ubuntu -bash 0 2020 2020 2024
1751 ubuntu -bash 0 2492 2528 2700
2284 ubuntu python test.py 0 1188 4682 12052
2287 ubuntu /usr/bin/python /usr/bin/sm 0 4696 5560 7160
2276 ubuntu python test.py 0 4016 8174 16304
2285 ubuntu python test.py 0 13260064 13263536 13270752
PID User Command Swap USS PSS RSS
2116 ubuntu top 0 700 773 1044
1442 ubuntu -bash 0 2020 2020 2024
1751 ubuntu -bash 0 2492 2528 2700
2284 ubuntu python test.py 0 1188 4682 12052
2288 ubuntu /usr/bin/python /usr/bin/sm 0 4692 5556 7156
2276 ubuntu python test.py 0 4016 8174 16304
2285 ubuntu python test.py 0 21692488 21695960 21703176
PID User Command Swap USS PSS RSS
2116 ubuntu top 0 700 773 1044
1442 ubuntu -bash 0 2020 2020 2024
1751 ubuntu -bash 0 2492 2528 2700
2284 ubuntu python test.py 0 1188 4682 12052
2289 ubuntu /usr/bin/python /usr/bin/sm 0 4696 5560 7160
2276 ubuntu python test.py 0 4016 8174 16304
2285 ubuntu python test.py 0 30115144 30118616 30125832
PID User Command Swap USS PSS RSS
2116 ubuntu top 0 700 771 1044
1442 ubuntu -bash 0 2020 2020 2024
1751 ubuntu -bash 0 2492 2527 2700
2284 ubuntu python test.py 0 1192 4808 12052
2290 ubuntu /usr/bin/python /usr/bin/sm 0 4700 5481 7164
2276 ubuntu python test.py 0 4092 8267 16304
2285 ubuntu python test.py 0 31823696 31827043 31834136
PID User Command Swap USS PSS RSS
2116 ubuntu top 0 700 771 1044
1442 ubuntu -bash 0 2020 2020 2024
1751 ubuntu -bash 0 2492 2527 2700
2284 ubuntu python test.py 0 1192 4808 12052
2291 ubuntu /usr/bin/python /usr/bin/sm 0 4700 5481 7164
2276 ubuntu python test.py 0 4092 8267 16304
2285 ubuntu python test.py 0 31823696 31827043 31834136
Process Process-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "test.py", line 17, in allocate_shared_array
    data=multiprocessing.Array(ctypes.c_ubyte,range(n))
  File "/usr/lib/python2.7/multiprocessing/__init__.py", line 260, in Array
    return Array(typecode_or_type, size_or_initializer, **kwds)
  File "/usr/lib/python2.7/multiprocessing/sharedctypes.py", line 115, in Array
    obj = RawArray(typecode_or_type, size_or_initializer)
  File "/usr/lib/python2.7/multiprocessing/sharedctypes.py", line 88, in RawArray
    result = _new_value(type_)
  File "/usr/lib/python2.7/multiprocessing/sharedctypes.py", line 63, in _new_value
    wrapper = heap.BufferWrapper(size)
  File "/usr/lib/python2.7/multiprocessing/heap.py", line 243, in __init__
    block = BufferWrapper._heap.malloc(size)
  File "/usr/lib/python2.7/multiprocessing/heap.py", line 223, in malloc
    (arena, start, stop) = self._malloc(size)
  File "/usr/lib/python2.7/multiprocessing/heap.py", line 120, in _malloc
    arena = Arena(length)
  File "/usr/lib/python2.7/multiprocessing/heap.py", line 82, in __init__
    self.buffer = mmap.mmap(-1, size)
error: [Errno 12] Cannot allocate memory
From the format of your print statements, you are using Python 2. Replace range(n) with xrange(n) to save some memory:

data=multiprocessing.Array(ctypes.c_ubyte,xrange(n))

(Or use Python 3.)
In Python 2, range(10**9) builds a full list, which takes roughly 8 GB (I just tried that on my Windows PC and it froze: just don't do that!). I tried with 10**7 instead just to be sure:

>>> z=range(int(10**7))
>>> sys.getsizeof(z)
80000064

That is 80 MB for the list's pointer array alone, before counting the int objects it refers to; multiply by 100 for 10**9 and you get about 8 GB.
A lazy object like xrange takes almost no memory, since it produces the values one by one as you iterate over it. In Python 3 they must have been fed up with these problems: they figured out that most people used range because they wanted lazy iteration, killed xrange, and made range a lazy sequence like the old xrange. Now if you really want to materialize all the numbers at once you have to write list(range(n)). At least you no longer allocate gigabytes by mistake!
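For comparison, a quick check in a 64-bit CPython 2.7 shell (the xrange figure is from my own test, and exact sizes vary by platform):

>>> import sys
>>> sys.getsizeof(range(10**7))    # full list: ~80 MB for the pointer array alone
80000064
>>> sys.getsizeof(xrange(10**7))   # lazy object: constant size, whatever n is
40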
Edit:
The OP's comment indicates that my explanation does not solve the problem, so I made some simple tests on my Windows box:

import multiprocessing
import sys
import ctypes

n = 10**7
a = multiprocessing.RawArray(ctypes.c_ubyte, range(n))  # or xrange(n)
z = input("hello")  # pause here so memory usage can be observed

With Python 2 it ramps up to 500 MB, then stays at 250 MB. With Python 3 it ramps up to 500 MB, then stays at 7 MB (which is strange, since it should be at least 10 MB...).

Conclusion: it peaks at 500 MB either way, so I'm not sure this will help, but can you try your program on Python 3 and see whether the overall memory peak is lower?
Unfortunately, the problem is not so much with the range; I only put that in as a simple illustration. In reality the data will be read from disk. I could also use n*["a"] and specify c_char in multiprocessing.Array as another example; that still uses around 16 GB even though I only have 1 GB of data in the list I'm passing to multiprocessing.Array. I'm wondering if there is some inefficient pickling going on, or something like that.
I seem to have found a workaround for what I need by using tempfile.SpooledTemporaryFile and numpy.memmap. I can open a memory map to a temp file in memory, which is spooled to disk when necessary, and share it among different processes by passing it as an argument to multiprocessing.Process.
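A minimal sketch of that workaround (my reconstruction from the description above, not the OP's actual code; it assumes Linux, where the child process inherits the mapping via fork, and uses a small illustrative size):

import tempfile
import numpy
import multiprocessing

def worker(m):
    m[0] = 123  # written into the shared mapping, visible to the parent

n = 10**6  # illustrative; the real data would be ~1 GB
f = tempfile.SpooledTemporaryFile(max_size=2 * n)
f.seek(n - 1)
f.write(b"\0")  # size the backing file before mapping it
f.flush()
# memmap needs a real file descriptor, which forces the spooled file to
# roll over to disk at this point.
m = numpy.memmap(f, dtype=numpy.uint8, mode="r+", shape=(n,))
p = multiprocessing.Process(target=worker, args=(m,))
p.start()
p.join()
print m[0]  # prints 123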
I'm still wondering what is going on with multiprocessing.Array, though. I don't know why it would use 16 GB for 1 GB of data.
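For what it's worth, one way to sidestep the initializer entirely (a sketch, untested against the OP's setup): pass a size instead of an iterable, so multiprocessing allocates the zero-filled shared block directly, then fill it in place through a numpy view.

import multiprocessing
import ctypes
import numpy

n = 10**9
# With a size argument the shared block is mmap'd and zero-filled directly;
# no temporary list of per-element Python objects is ever built.
shared = multiprocessing.RawArray(ctypes.c_ubyte, n)
buf = numpy.frombuffer(shared, dtype=numpy.uint8)  # a view, not a copy
buf[0:3] = [1, 2, 3]  # fill in place, e.g. chunk by chunk from disk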