
Python randomly drops to 0% CPU usage, causing the code to “hang up” when handling large numpy arrays?

I have been running some code, part of which loads a large 1D numpy array from a binary file and then alters the array using the numpy.where() function.

Here is an example of the operations performed in the code:

import numpy as np
num = 2048
threshold = 0.5

# 'file' is the path to the ~32 GB binary file
with open(file, 'rb') as f:
    # read num**3 float32 values into a 1D array
    arr = np.fromfile(f, dtype=np.float32, count=num**3)
    arr *= threshold

# clamp values at 1.0, then compute the average over the whole array
arr = np.where(arr >= 1.0, 1.0, arr)
vol_avg = np.sum(arr)/(num**3)

# both arr and vol_avg needed later

I have run this many times (on an otherwise idle machine, i.e. nothing else competing for CPU or memory) without issue. But recently I have noticed that the code sometimes hangs for an extended period of time, making the runtime an order of magnitude longer. On these occasions I have been monitoring %CPU and memory usage (using GNOME System Monitor), and found that Python's CPU usage drops to 0%.

Using basic prints between the above operations to debug, it seems arbitrary which operation causes the pause (i.e. open(), np.fromfile() and np.where() have each caused a hang on some random run). It is as if I am being throttled at random, because on other runs there are no hangs at all.
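For reference, here is a minimal sketch of that kind of step-by-step debugging (the path variable and log messages are placeholders, not from the original code). Flushing each print matters over an ssh session, where buffered output can make the wrong step look like the one that stalls:

import time
import numpy as np

num = 2048
threshold = 0.5
path = 'data.bin'   # placeholder for the 32 GB binary file

t0 = time.perf_counter()

def log(msg):
    # flush=True so each message appears immediately over the ssh session
    print(f'[{time.perf_counter() - t0:8.1f} s] {msg}', flush=True)

log('opening file')
with open(path, 'rb') as f:
    log('reading array with np.fromfile')
    arr = np.fromfile(f, dtype=np.float32, count=num**3)
    log('scaling in place')
    arr *= threshold

log('clamping with np.where')
arr = np.where(arr >= 1.0, 1.0, arr)
log('computing vol_avg')
vol_avg = np.sum(arr) / (num**3)
log('done')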

I have considered things like garbage collection or this question, but I cannot see any obvious relation to my problem (for example, keystrokes have no effect).

Further notes: the binary file is 32 GB, and the machine (running Linux) has 256 GB of memory. I am running this code remotely, via an ssh session.

EDIT: This may be incidental, but I have noticed that there are no hang-ups if I run the code right after the machine has been rebooted. They seem to begin after a couple of runs, or at least after other usage of the system.

np.where creates a copy there and assigns it back into arr. So, we could save memory by avoiding that copying step, like so:

vol_avg = (np.sum(arr) - (arr[arr >= 1.0] - 1.0).sum())/(num**3)

We are using boolean indexing to select the elements that are greater than or equal to 1.0, taking their offsets from 1.0, summing those up and subtracting the result from the total sum. Hopefully the number of such elements is small, so this won't add any noticeable memory requirement. I am assuming this hanging issue with large arrays is a memory-based one.
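If the clamped array itself is also needed later (as the question states), another option, offered here as a sketch rather than part of the original answer, is to clamp in place with np.minimum and out=, so that no second full-size array is ever allocated:

import numpy as np

num = 2048
# A small random array stands in for the 2048**3 float32 array from the
# question, so this snippet runs on its own.
arr = np.random.rand(num).astype(np.float32) * 2.0

# Clamp in place: unlike arr = np.where(arr >= 1.0, 1.0, arr), this writes
# the result back into arr without allocating another array of the same size.
np.minimum(arr, 1.0, out=arr)

vol_avg = arr.sum() / arr.size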

The drops in CPU usage were unrelated to Python or numpy; they were indeed a result of reading from a shared disk, and network I/O was the real culprit. For such large arrays, reading into memory can be a major bottleneck.
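One way to confirm that (a sketch, with the file path and chunk size chosen arbitrarily rather than taken from the original post) is to read the file in pieces and print the effective throughput of each piece; stalls on a shared or network filesystem then show up directly as chunks with very low MB/s:

import time
import numpy as np

num = 2048
path = 'data.bin'            # placeholder for the 32 GB file on the shared disk
chunk = 256 * 1024**2 // 4   # roughly 256 MB worth of float32 values per read

total = num**3
arr = np.empty(total, dtype=np.float32)

with open(path, 'rb') as f:
    pos = 0
    while pos < total:
        n = min(chunk, total - pos)
        t0 = time.perf_counter()
        # np.fromfile reads from the file's current position, so successive
        # calls pull consecutive chunks of the file.
        arr[pos:pos + n] = np.fromfile(f, dtype=np.float32, count=n)
        dt = time.perf_counter() - t0
        mb = n * 4 / 1e6
        print(f'read {mb:7.1f} MB in {dt:6.2f} s ({mb / dt:7.1f} MB/s)',
              flush=True)
        pos += n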

Did you click in or select text in the console window? That can "hang" the process: the console enters "QuickEdit Mode". Pressing any key resumes the process.
