Running a BERT SQuAD model on a GPU

Question

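I am using a BERT SQuAD model to ask the same question of a set of documents (>20,000). The model currently runs on my CPU and takes about a minute per document, which means the whole job will take days to finish.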

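I would like to know whether I can speed this up by running the model on a GPU. However, I am new to GPUs and do not know how to send the inputs and the model to the device (Titan Xp).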

The code is borrowed from Chris McCormick.

import torch
import tensorflow as tf
from transformers import BertForQuestionAnswering
from transformers import BertTokenizer

model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')

'question' and 'answer_text' are the question and context strings, respectively.

    input_ids = tokenizer.encode(question, answer_text)

    # ======== Set Segment IDs ========
    # Search the input_ids for the first instance of the `[SEP]` token.
    sep_index = input_ids.index(tokenizer.sep_token_id)

    if len(input_ids)>512:
        input_ids=input_ids[:512]
  
    num_seg_a = sep_index + 1
    num_seg_b = len(input_ids) - num_seg_a

    # Construct the list of 0s and 1s.
    segment_ids = [0]*num_seg_a + [1]*num_seg_b

    # There should be a segment_id for every input token.
    assert len(segment_ids) == len(input_ids)

    # ======== Evaluate ========
    # Run our example through the model.
    outputs = model(torch.tensor([input_ids]), # The tokens representing our input text.
                    token_type_ids=torch.tensor([segment_ids]), # The segment IDs to differentiate question from answer_text
                    return_dict=True) 

    start_scores = outputs.start_logits
    end_scores = outputs.end_logits

I know I can send the model to the GPU with model.to('cuda'). But how do I send the inputs, train the model, and retrieve the output from the GPU?

Answer 1

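It has been a while, but I will answer anyway in the hope that it helps someone. You can copy each tensor to the GPU with the to method. For example, suppose your batch contains four PyTorch tensors: input IDs, attention mask, segment IDs, and labels.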

device = torch.device("cuda")
b_input_ids = batch[0].to(device)
b_input_mask = batch[1].to(device)
b_seg_ids = batch[2].to(device)
b_labels = batch[3].to(device)
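The snippet above takes `batch` as given. As a minimal sketch, assuming the batches come from a `torch.utils.data.DataLoader` over a `TensorDataset` (an assumption, the answer does not say where `batch` comes from), the same device transfer could look like this. The token IDs and labels are made-up toy values, and the code falls back to the CPU when no GPU is visible.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Use the GPU when PyTorch can see one, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Made-up, already padded examples: 3 sequences of 4 token IDs each.
input_ids = torch.tensor([[101, 7, 8, 102], [101, 9, 102, 0], [101, 5, 6, 102]])
input_mask = (input_ids != 0).long()   # 1 for real tokens, 0 for padding
seg_ids = torch.zeros_like(input_ids)  # every token in segment A in this toy data
labels = torch.tensor([1, 0, 1])       # placeholder labels

loader = DataLoader(TensorDataset(input_ids, input_mask, seg_ids, labels), batch_size=2)

batch_sizes = []
for batch in loader:
    # Each batch is a list of four tensors; copy all of them to the device.
    b_input_ids, b_input_mask, b_seg_ids, b_labels = (t.to(device) for t in batch)
    batch_sizes.append(b_input_ids.shape[0])
```

Processing several documents per batch, rather than one at a time, is usually where most of the GPU speed-up comes from, provided the sequences are padded to a common length.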

Then you can transfer the logits and labels from the GPU back to the CPU with .cpu(). For example:

start_logits = start_logits.detach().cpu()
end_logits = end_logits.detach().cpu()

Or, analogous to to(device), you can use

start_logits = start_logits.to('cpu')
end_logits = end_logits.to('cpu')
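The two forms are not fully interchangeable: `.to('cpu')` only moves the data, while `.detach()` also cuts the tensor off from the autograd graph. A small sketch with made-up logits (not real model output) shows the difference; `requires_grad=True` here mimics an output that was produced outside `torch.no_grad()`.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Made-up logits standing in for a model output that still tracks gradients.
logits = torch.tensor([[0.1, 2.5, -1.0, 0.3]], device=device, requires_grad=True)

moved = logits.to("cpu")          # on the CPU, but still tracked by autograd
detached = logits.detach().cpu()  # on the CPU and cut off from the graph

try:
    moved.numpy()                 # fails: the tensor still requires grad
    numpy_ok = True
except RuntimeError:
    numpy_ok = False

# Works once the tensor is detached; argmax gives the highest-scoring position.
best_start = int(detached.numpy().argmax())
```

If the forward pass runs inside `torch.no_grad()`, the outputs do not track gradients in the first place, and either form is enough.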

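Note: depending on how you use the logits afterwards, you may need to append .numpy() to convert them to NumPy arrays (the tensor must be detached and on the CPU first).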
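Putting the pieces together for the single-example loop in the question, a hedged sketch might look like the following. `TinyQAModel` is a hypothetical stand-in defined here only so the example runs without downloading BERT; it is not part of transformers. The helper `answer_logits` is likewise an illustrative name. With the real `BertForQuestionAnswering`, the same pattern should apply: move the model once, build the input tensors on the same device, and run under `torch.no_grad()`.

```python
import torch
from types import SimpleNamespace

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyQAModel(torch.nn.Module):
    """Hypothetical stand-in for BertForQuestionAnswering (not a real BERT)."""

    def __init__(self, vocab_size=100, hidden=8):
        super().__init__()
        self.tok = torch.nn.Embedding(vocab_size, hidden)
        self.seg = torch.nn.Embedding(2, hidden)
        self.qa = torch.nn.Linear(hidden, 2)  # one start and one end logit per token

    def forward(self, input_ids, token_type_ids, return_dict=True):
        h = self.tok(input_ids) + self.seg(token_type_ids)
        start, end = self.qa(h).unbind(dim=-1)
        return SimpleNamespace(start_logits=start, end_logits=end)


def answer_logits(model, input_ids, segment_ids, device):
    """Run one example on `device` and return the start/end logits on the CPU."""
    ids = torch.tensor([input_ids], device=device)     # inputs live on the model's device
    segs = torch.tensor([segment_ids], device=device)
    with torch.no_grad():                              # inference only, no gradients
        out = model(ids, token_type_ids=segs, return_dict=True)
    return out.start_logits.cpu(), out.end_logits.cpu()


model = TinyQAModel().to(device)  # move the weights once, before looping over documents
model.eval()                      # disable dropout for inference

start, end = answer_logits(model, [1, 5, 2, 7, 8, 2], [0, 0, 0, 1, 1, 1], device)
```

Moving the model inside the per-document loop would repeat the weight transfer for every document, so the sketch does it once up front.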

Source: https://discuss.pytorch.org/t/time-to-transform-gpu-to-cpu-with-cpu/18856

Solution 1
1 2021-09-10 07:16:45
