In a Python 2.7 script, a first block of multiprocessing code processes a big chunk of a numpy array. This is basically a projection ray frameblock between an image plane and a Cartesian (world) plane. That part, called pool1, works fine.
Further in the script, I try to reuse the same multiprocessing approach to project a lot of images with this projection ray frameblock.
It seems that only 4 to 6 workers are actually working, although all of them are loaded with data and ready to work. pool2 creates the workers, their memory usage grows slowly, and only up to 6 of them use any CPU.
Notes :
Arguments info :
A simplified version of the code looks like this:
import multiprocessing

def georef(paramsGeoRef):
    # Pseudo workflow
    """
    - unpack arguments: Frameclock, A1, A2, B1, B2, fileName, D1, D2, D3, P1, P2 <== paramsGeoRef
    - load the tif image
    - energy conversion with function and P1, P2
    - proportional projection of the image
      - Frameclock, A1, A2
    - energy conversion with function and P1, P2
    - figure creation
    - GeoTIFF creation
    - export figure, GeoTIFF and numpy file to disk
    """
    return None

if __name__ == '__main__':
    # Build one argument list per image
    paramsGeoRef = []
    for im in imgfiles:
        paramsGeoRef.append([Frameclock, A1, A2, B1, B2, fileName, D1, D2, D3, P1, P2])
    if flag_parallel:
        # Leave one core free for the parent process
        cpus = multiprocessing.cpu_count()
        cpus = cpus - 1
        pool2 = multiprocessing.Pool(processes=cpus)
        pool2.map(georef, paramsGeoRef)
        pool2.close()
        pool2.join()
I tried different approaches, such as:
Unpacking the arguments beforehand:
def star_georef(paramsGeoRef):
    # Unpack the argument list into georef's positional parameters
    return georef(*paramsGeoRef)
def georef(Frameclock, A1, A2, B1, B2, fileName, D1, D2, D3, P1, P2):
    # Pseudo workflow...
    return None
Used another map type:
pool2.imap_unordered()
What is wrong? Why does this method work for crunching a numpy array, but not for this purpose? Do I need to set a chunksize?
Maybe I need to feed the workers from a job generator as soon as they become available?
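The job-generator idea in the question can be sketched roughly as follows. This is only a minimal illustration, not the original script: `make_jobs`, `georef_stub` and `run_pool` are hypothetical names, and the stub merely returns the file name it receives. Note that `imap_unordered` already uses `chunksize=1` by default, so each free worker takes one job at a time; this on its own does not reduce the size of the arguments sent to each worker.

```python
from __future__ import print_function
import multiprocessing


def make_jobs(image_names, shared_params):
    """Yield one small argument tuple per image (hypothetical helper)."""
    for name in image_names:
        yield (name,) + tuple(shared_params)


def georef_stub(job):
    """Hypothetical stand-in for georef: returns the file name it was given."""
    file_name = job[0]
    return file_name


def run_pool(image_names, shared_params, n_workers=2):
    """Run the stub over all jobs and collect results in completion order."""
    pool = multiprocessing.Pool(processes=n_workers)
    try:
        # chunksize=1 hands out one job per request, so a worker that
        # finishes early simply picks up the next pending image.
        results = list(pool.imap_unordered(
            georef_stub, make_jobs(image_names, shared_params), chunksize=1))
    finally:
        pool.close()
        pool.join()
    return results


if __name__ == '__main__':
    names = ['img_%03d.tif' % i for i in range(8)]
    done = run_pool(names, (1.0, 2.0))
    print(sorted(done))
```

Results arrive in completion order rather than input order, which is why the usage example sorts them before printing.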
Following Martineau's advice, I saved the Frameclock, A1 and A2 arguments with numpy in .npy format. I then load the .npy files inside the parallelized function, such as:
def georef(paramsGeoRef):
    # Pseudo workflow
    """
    - unpack arguments: Frameblock, A1, A2, B1, B2, fileName, D1, D2, D3, P1, P2 <== paramsGeoRef
    - load Frameblock from its .npy file
    - load A1 from its .npy file
    - load A2 from its .npy file
    - load the tif image
    - energy conversion with function and P1, P2
    - proportional projection of the image
      - Frameclock, A1, A2
    - energy conversion with function and P1, P2
    - figure creation
    - GeoTIFF creation
    - export figure, GeoTIFF and numpy file to disk
    """
    return None
Even with the extra saving and loading, there is a drastic efficiency gain. All the workers are now working.
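A likely explanation is that `Pool.map` pickles every job's arguments in the parent process and sends them to the workers through a pipe, so large arrays repeated in each argument list can keep most workers waiting for data. Passing file paths keeps each job small. The sketch below illustrates the save-then-load pattern under that assumption; `save_shared_arrays` and `georef_from_paths` are hypothetical names, the arrays are tiny placeholders, and the sum stands in for the real projection work.

```python
from __future__ import print_function
import multiprocessing
import os
import tempfile

import numpy as np


def save_shared_arrays(arrays, directory):
    """Write each named array to <directory>/<name>.npy and return the paths."""
    paths = {}
    for name, arr in arrays.items():
        path = os.path.join(directory, name + '.npy')
        np.save(path, arr)
        paths[name] = path
    return paths


def georef_from_paths(job):
    """Hypothetical worker: receives file paths and loads the arrays itself."""
    file_name, paths = job
    frame = np.load(paths['Frameclock'])
    a1 = np.load(paths['A1'])
    a2 = np.load(paths['A2'])
    # Placeholder for the real projection work.
    return file_name, float(frame.sum() + a1.sum() + a2.sum())


if __name__ == '__main__':
    tmp_dir = tempfile.mkdtemp()
    paths = save_shared_arrays(
        {'Frameclock': np.ones((4, 4)), 'A1': np.ones(4), 'A2': np.ones(4)},
        tmp_dir)
    # Each job now carries a file name and a few short path strings only.
    jobs = [('img_%03d.tif' % i, paths) for i in range(8)]
    pool = multiprocessing.Pool(processes=2)
    try:
        results = pool.map(georef_from_paths, jobs)
    finally:
        pool.close()
        pool.join()
    print(results[0])
```

For arrays that are only read, `np.load(path, mmap_mode='r')` may be worth trying as well, since it maps the file instead of copying it into each worker's memory.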