How to do random search for hyperparameters on different GPUs in parallel?

Assuming my model uses only one GPU, but the virtual machine has 4.

How can I leverage all GPUs for this code?

import numpy as np
import torch.optim as optim

# ThreeLayerConvNet, Engine, loader_train, loader_val, device and dtype
# are defined elsewhere in the notebook.
channel_1_range = [8, 16, 32, 64]
channel_2_range = [8, 16, 32, 64]
kernel_size_1_range = [3, 5, 7]
kernel_size_2_range = [3, 5, 7]
max_count = 40
for count in range(max_count):
    reg = 10**np.random.uniform(-3, 0)
    learning_rate = 10**np.random.uniform(-6, -3)
    channel_1 = channel_1_range[np.random.randint(low=0, high=len(channel_1_range))]
    channel_2 = channel_2_range[np.random.randint(low=0, high=len(channel_2_range))]
    kernel_size_1 = kernel_size_1_range[np.random.randint(low=0, high=len(kernel_size_1_range))]
    kernel_size_2 = kernel_size_2_range[np.random.randint(low=0, high=len(kernel_size_2_range))]

    model = ThreeLayerConvNet(in_channel=3, channel_1=channel_1, kernel_size_1=kernel_size_1, \
        channel_2=channel_2, kernel_size_2=kernel_size_2, num_classes=10)
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    engine = Engine(loader_train=loader_train, loader_val=loader_val, device=device, dtype=dtype, print_every=100, \
        verbose=False)
    engine.train(model, optimizer, epochs=1, reg=reg)

    print("Reg: {0:.2E}, LR: {1:.2E}, Ch_1: {2:2} [{4}], Ch_2: {3:2} [{5}], Acc: {6:.2f} [{7:.2f}], {8:.2f} secs". \
         format(reg, learning_rate, channel_1, channel_2, kernel_size_1, kernel_size_2, \
               engine.accuracy, engine.accuracy_train, engine.duration))

One option is to move this into a standalone console app, start N instances (N == number of GPUs), and aggregate the results into one output file.
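For what it's worth, here is a rough sketch of how that option could be driven from Python. It assumes the loop above has been saved as a hypothetical random_search.py script that takes an output file name; each instance is pinned to one GPU via CUDA_VISIBLE_DEVICES and the per-GPU files are aggregated at the end.

import os
import subprocess

num_gpus = 4
procs = []
for gpu in range(num_gpus):
    # CUDA_VISIBLE_DEVICES pins each instance to a single device, so
    # 'cuda:0' inside the script maps to a different physical GPU per process.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    procs.append(subprocess.Popen(
        ["python", "random_search.py", "results_gpu{}.txt".format(gpu)], env=env))

for p in procs:
    p.wait()

# Aggregate the per-GPU result files into a single output file.
with open("results.txt", "w") as out:
    for gpu in range(num_gpus):
        with open("results_gpu{}.txt".format(gpu)) as f:
            out.write(f.read())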

Is it possible to do it directly in Python, so I can keep using a Jupyter notebook?

In PyTorch you can distribute your models across different GPUs. I think in your case it's the device parameter that allows you to specify the actual GPU:

device1 = torch.device('cuda:0')
device2 = torch.device('cuda:1')
             .
             .
             .
devicen = torch.device('cuda:n')
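For example, a minimal sketch of placing two models on two different GPUs (a toy nn.Linear stands in for the question's ThreeLayerConvNet; the inputs have to be moved to the same device as the model they are fed to):

import torch
import torch.nn as nn

device1 = torch.device('cuda:0')
device2 = torch.device('cuda:1')

# One model per GPU; both can be used at the same time.
model_a = nn.Linear(10, 2).to(device1)
model_b = nn.Linear(10, 2).to(device2)

x = torch.randn(4, 10)
out_a = model_a(x.to(device1))  # runs on cuda:0
out_b = model_b(x.to(device2))  # runs on cuda:1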

I don't remember the exact details, but if my memory serves me well, you might need to make your code non-blocking by using threading or multiprocessing (better go with multiprocessing to be sure; otherwise the GIL might cause you problems if you fully utilize your process). In your case that would mean parallelising your for loop, for instance by having a Queue containing all model configurations and then spawning threads/processes to consume it (where each of the spawned processes working on the queue corresponds to one GPU), as sketched below.
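A rough sketch of that Queue-based approach with torch.multiprocessing, one worker process per GPU (sample_random_config and train_one are hypothetical helpers wrapping the sampling part and the body of the loop above, not existing functions):

import torch
import torch.multiprocessing as mp

def worker(gpu_id, task_queue, result_queue):
    # Each worker is pinned to one GPU and keeps pulling hyperparameter
    # configurations until it sees the None sentinel.
    device = torch.device('cuda:{}'.format(gpu_id))
    while True:
        config = task_queue.get()
        if config is None:
            break
        result_queue.put((config, train_one(config, device)))

if __name__ == '__main__':
    mp.set_start_method('spawn', force=True)  # CUDA requires 'spawn' for child processes
    num_gpus, max_count = 4, 40
    task_queue, result_queue = mp.Queue(), mp.Queue()

    for _ in range(max_count):
        task_queue.put(sample_random_config())  # draw reg, lr, channels, kernel sizes
    for _ in range(num_gpus):
        task_queue.put(None)                    # one sentinel per worker

    procs = [mp.Process(target=worker, args=(g, task_queue, result_queue))
             for g in range(num_gpus)]
    for p in procs:
        p.start()
    results = [result_queue.get() for _ in range(max_count)]  # collect before joining
    for p in procs:
        p.join()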

So to answer your question: yes, you can do it in pure Python (I did a while back, so I'm 100% positive). You can even let one GPU process multiple models (but make sure to calculate your VRAM requirements correctly beforehand). Whether it's actually worth it compared to just starting multiple jobs is up to you, though.

As a little side note: if you run it as a 'standalone' script, it might still use the same GPU if the GPU index isn't adjusted automatically; otherwise PyTorch might try using DataParallel distribution...
