I have code which looks like this:
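import cv2
from multiprocessing.pool import ThreadPool
from tqdm import tqdm

def get_image_stats(fp):
    img = cv2.imread(fp)  # reads the file and decodes it into a NumPy array
    h, w = img.shape[:2]
    return h, w, h / w  # height, width, aspect ratio

with ThreadPool(16) as pool:
    res = list(tqdm(pool.imap_unordered(get_image_stats, df.file_path), total=len(df)))
heights, widths, ars = list(zip(*res))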
The only library-specific part there is cv2.imread, which simply loads an image file into a NumPy array, so I assumed the work is I/O bound.
Why would my CPU usage look like this?
Notes on that image:
Another note: I did not set the pool size to 16 because I have 16 cores. That is just a coincidence.
So why is this using up 75% of 16 cores at once?
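Because the work is not purely I/O bound. cv2.imread does not just read bytes from disk; it also decodes the compressed image (JPEG, PNG, and so on) into a NumPy array, and that decoding is CPU work. OpenCV's Python bindings release the GIL while the native code runs, so your thread pool is going to use 1 core per thread if it can. That's what gives maximum parallelism and maximizes throughput.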
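One way to check whether a thread pool is really keeping several cores busy is to compare the CPU time the process consumed with the wall-clock time that passed. The sketch below is illustrative only: it uses zlib.decompress as a stand-in for an image decoder (it is C code that releases the GIL while it works), and decode, make_blobs, and cores_used are hypothetical names that do not appear in the original code.

```python
import time
import zlib
from multiprocessing.pool import ThreadPool


def decode(blob):
    # Stand-in for the decode step of an image loader: CPU-bound C code
    # that releases the GIL while it runs (assumption for illustration).
    return len(zlib.decompress(blob))


def make_blobs(n, size):
    # Hypothetical workload: n references to one compressed buffer.
    return [zlib.compress(bytes(size))] * n


def cores_used(func, items, n_threads):
    """Return (results, CPU time / wall time) for a ThreadPool run."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time summed over all threads
    with ThreadPool(n_threads) as pool:
        results = list(pool.imap_unordered(func, items))
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return results, cpu / wall


if __name__ == "__main__":
    _, ratio = cores_used(decode, make_blobs(64, 4_000_000), 16)
    print(f"average cores busy: {ratio:.2f}")
```

A ratio close to 1 would mean the threads took turns on a single core, while a ratio well above 1 means they ran in parallel. The exact number depends on the machine and the workload.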