
Ray does not run a Python function in parallel on the GPU

I'm using a Python package called Ray to run the example shown below in parallel. The code is run on a machine with 80 CPU cores and 4 GPUs.

import ray
import time

ray.init()

@ray.remote
def squared(x):
    time.sleep(1)  # simulate one second of work per task
    y = x**2
    return y

tic = time.perf_counter()

lazy_values = [squared.remote(x) for x in range(1000)]
values = ray.get(lazy_values)

toc = time.perf_counter()

print(f'Elapsed time {toc - tic:.2f} s')
print(f'{values[:5]} ... {values[-5:]}')

ray.shutdown()

Output from the above example is:

Elapsed time 13.09 s
[0, 1, 4, 9, 16] ... [990025, 992016, 994009, 996004, 998001]

The elapsed time of about 13 seconds is what I would expect for 1,000 one-second tasks spread across 80 CPU cores (1000 / 80 ≈ 12.5 s), so the CPU version does run in parallel.

Below is the same example, but I would like to run it on the GPU using the num_gpus parameter. The GPUs available on the machine are NVIDIA Tesla V100s.

import ray
import time

ray.init(num_gpus=1)

@ray.remote(num_gpus=1)
def squared(x):
    time.sleep(1)
    y = x**2
    return y

tic = time.perf_counter()

lazy_values = [squared.remote(x) for x in range(1000)]
values = ray.get(lazy_values)

toc = time.perf_counter()

print(f'Elapsed time {toc - tic:.2f} s')
print(f'{values[:5]} ... {values[-5:]}')

ray.shutdown()

The GPU example never completed and I terminated it after several minutes. I checked the resources available to Ray using import ray; ray.init(); ray.available_resources() and it reports 80 CPUs and 4 GPUs. So it seems that Ray knows about the available GPUs.
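For reference, here is that resource check as a small standalone script; ray.available_resources() returns a dictionary keyed by resource name (the exact contents vary by machine, but the 'CPU' and 'GPU' entries are the ones that matter here):

import ray

ray.init()

# Ray reports the resources it detected as a dictionary,
# e.g. {'CPU': 80.0, 'GPU': 4.0, ...}
print(ray.available_resources())

ray.shutdown()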

I modified the GPU example to run fewer tasks by changing range(1000) to range(10). See the revised example below.

import ray
import time

ray.init(num_gpus=1)

@ray.remote(num_gpus=1)
def squared(x):
    time.sleep(1)
    y = x**2
    return y

tic = time.perf_counter()

lazy_values = [squared.remote(x) for x in range(10)]
values = ray.get(lazy_values)

toc = time.perf_counter()

print(f'Elapsed time {toc - tic:.2f} s')
print(f'{values[:5]} ... {values[-5:]}')

ray.shutdown()

The output from the revised GPU example is:

Elapsed time 10.06 s
[0, 1, 4, 9, 16] ... [25, 36, 49, 64, 81]

The revised GPU example completed, but ten one-second tasks took about 10 seconds, so they ran one after another; it looks like Ray is not using the GPU in parallel. Is there something else I should do to get Ray to run on the GPU in parallel?

@ray.remote(num_gpus=1)

That tells Ray that your function will consume an entire GPU, so the tasks run serially, one at a time (which is also why the 1000-task run appeared to hang: 1000 one-second tasks executed back to back take roughly 17 minutes). The documentation says you should specify a fractional number here so that multiple tasks can share a GPU and run concurrently:

@ray.remote(num_gpus=0.1)

https://docs.ray.io/en/latest/using-ray-with-gpus.html
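For example, here is a sketch of the original 1000-task example rewritten with a fractional GPU request. The 0.1 value is the one suggested above and is only an illustration: with it, up to 10 tasks can hold a slot on each GPU at once (up to 40 concurrent tasks if Ray detects all 4 GPUs). Note that num_gpus only controls scheduling and sets CUDA_VISIBLE_DEVICES for each task; the function body still has to call a GPU library (CuPy, PyTorch, etc.) to do any actual work on the device.

import ray
import time

# let Ray auto-detect the machine's resources (80 CPUs, 4 GPUs)
ray.init()

# each task reserves one tenth of a GPU, so up to 10 tasks can share
# a GPU at the same time; pick the fraction based on how much GPU
# memory a single task actually needs
@ray.remote(num_gpus=0.1)
def squared(x):
    time.sleep(1)  # placeholder for real GPU work
    return x**2

tic = time.perf_counter()

lazy_values = [squared.remote(x) for x in range(1000)]
values = ray.get(lazy_values)

toc = time.perf_counter()

print(f'Elapsed time {toc - tic:.2f} s')
print(f'{values[:5]} ... {values[-5:]}')

ray.shutdown()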
