

Running BERT SQuAD model on GPU

I am using the BERT SQuAD model to ask the same question of a collection of documents (>20,000). The model currently runs on my CPU and takes around a minute to process a single document, which means I'll need several days to complete the program.

I was wondering if I could speed this up by running the model on a GPU. However, I am new to GPUs and I don't know how to send the inputs and the model to the device (a Titan Xp).

The code is borrowed from Chris McCormick.

import torch
from transformers import BertForQuestionAnswering
from transformers import BertTokenizer

# Load the BERT-large model fine-tuned on SQuAD, and its tokenizer.
model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

'question' and 'answer_text' are the question and the context string, respectively.

    input_ids = tokenizer.encode(question, answer_text)

    # ======== Set Segment IDs ========
    # Search the input_ids for the first instance of the `[SEP]` token.
    sep_index = input_ids.index(tokenizer.sep_token_id)

    # BERT accepts at most 512 tokens, so truncate longer inputs.
    if len(input_ids) > 512:
        input_ids = input_ids[:512]

    # Segment A covers the question up to and including the first `[SEP]`;
    # segment B covers the rest (the context).
    num_seg_a = sep_index + 1
    num_seg_b = len(input_ids) - num_seg_a

    # Construct the list of 0s and 1s.
    segment_ids = [0]*num_seg_a + [1]*num_seg_b

    # There should be a segment_id for every input token.
    assert len(segment_ids) == len(input_ids)

    # ======== Evaluate ========
    # Run our example through the model.
    outputs = model(torch.tensor([input_ids]),                   # The tokens representing our input text.
                    token_type_ids=torch.tensor([segment_ids]),  # The segment IDs to differentiate question from answer_text.
                    return_dict=True)

    start_scores = outputs.start_logits
    end_scores = outputs.end_logits

I know that I can send the model to the GPU with model.to('cuda'). But how do I send the inputs to the GPU, run the model, and retrieve the output from it?

It's been a while, but I'll answer anyway in the hope that it may help someone. You can copy each tensor to the GPU using the .to() method. For example, if your batch contains four PyTorch tensors (input IDs, attention masks, segment IDs, and labels):

device = torch.device("cuda")
b_input_ids = batch[0].to(device)
b_input_mask = batch[1].to(device)
b_seg_ids = batch[2].to(device)
b_labels = batch[3].to(device)
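Applied to the inference code from the question, the model itself also has to be moved to the device, and the tensors built from input_ids and segment_ids have to be copied to the same device. Here is a minimal sketch (my adaptation, not from the original answer); wrapping the forward pass in torch.no_grad() is an assumption that you only need inference, which saves memory and time:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move the model's weights to the GPU once, before looping over documents.
model.to(device)
model.eval()

# Inference only, so gradients are not needed.
with torch.no_grad():
    outputs = model(torch.tensor([input_ids]).to(device),
                    token_type_ids=torch.tensor([segment_ids]).to(device),
                    return_dict=True)

start_scores = outputs.start_logits  # still on the GPU at this point
end_scores = outputs.end_logits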

Then, you can use .cpu() to transfer the logits and labels from the GPU back to the CPU. For example:

start_logits = start_logits.detach().cpu()
end_logits = end_logits.detach().cpu()

or, similarly to .to(device), you can use

start_logits = start_logits.to('cpu')
end_logits = end_logits.to('cpu')

Note: if you want to process the results further (e.g. with NumPy), you will probably need to append .numpy() to the end to convert them to NumPy arrays.
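As a concrete illustration (a sketch I'm adding, not part of the original answer), the whole chain plus a simple argmax-based answer extraction could look like this; np.argmax and tokenizer.convert_ids_to_tokens are standard NumPy / transformers calls:

import numpy as np

# Detach from the graph, copy to the CPU, and convert to NumPy.
start_scores = outputs.start_logits.detach().cpu().numpy()
end_scores = outputs.end_logits.detach().cpu().numpy()

# Pick the most likely start and end token positions (batch size is 1).
answer_start = np.argmax(start_scores)
answer_end = np.argmax(end_scores)

# Reconstruct the answer string from the tokens in between.
tokens = tokenizer.convert_ids_to_tokens(input_ids)
answer = ' '.join(tokens[answer_start:answer_end + 1])

Note that joining WordPiece tokens with spaces leaves '##' markers in the output; this is only meant to show the GPU-to-NumPy round trip.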

Source: https://discuss.pytorch.org/t/time-to-transform-gpu-to-cpu-with-cpu/18856
