
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first

I'm working on the Udacity Deep Learning project on generating TV scripts, and I ran into the error shown below. The following function is run after the model has been trained:

def generate(rnn, prime_id, int_to_vocab, token_dict, pad_value, predict_len=100):
    """
    Generate text using the neural network
    param rnn: The PyTorch Module that holds the trained neural network
    param prime_id: The word id to start the first prediction
    param int_to_vocab: Dict of word id keys to word values
    param token_dict: Dict of punctuation tokens keys to punctuation values
    param pad_value: The value used to pad a sequence
    param predict_len: The length of text to generate
    return: The generated text
    """
    rnn.eval()

    # create a sequence (batch_size=1) with the prime_id
    current_seq = np.full((1, sequence_length), pad_value)
    current_seq[-1][-1] = prime_id
    predicted = [int_to_vocab[prime_id]]

    for _ in range(predict_len):
        if train_on_gpu:
            current_seq = torch.LongTensor(current_seq).cuda()
        else:
            current_seq = torch.LongTensor(current_seq)

        # initialize the hidden state
        hidden = rnn.init_hidden(current_seq.size(0))

        # get the output of the rnn
        output, _ = rnn(current_seq, hidden)

        # get the next word probabilities
        p = F.softmax(output, dim=1).data
        if(train_on_gpu):
            p = p.cpu() # move to cpu

        # use top_k sampling to get the index of the next word
        top_k = 5
        p, top_i = p.topk(top_k)
        top_i = top_i.numpy().squeeze()

        # select the likely next word index with some element of randomness
        p = p.numpy().squeeze()
        word_i = np.random.choice(top_i, p=p/p.sum())

        # retrieve that word from the dictionary
        word = int_to_vocab[word_i]
        predicted.append(word)     

        # the generated word becomes the next "current sequence" and the cycle can continue
        current_seq = np.roll(current_seq, -1, 1)
        current_seq[-1][-1] = word_i

    gen_sentences = ' '.join(predicted)

    # Replace punctuation tokens
    for key, token in token_dict.items():
        ending = ' ' if key in ['\n', '(', '"'] else ''
        gen_sentences = gen_sentences.replace(' ' + token.lower(), key)
    gen_sentences = gen_sentences.replace('\n ', '\n')
    gen_sentences = gen_sentences.replace('( ', '(')

    # return all the sentences
    return gen_sentences

After this, the following code is run:

# run the cell multiple times to get different results!
gen_length = 400 # modify the length to your preference
prime_word = 'jerry' # name for starting the script

pad_word = helper.SPECIAL_WORDS['PADDING']
generated_script = generate(trained_rnn, vocab_to_int[prime_word + ':'], int_to_vocab, token_dict, vocab_to_int[pad_word], gen_length)
print(generated_script)

Upon running this code, I get the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-40-68a17c4d1704> in <module>()
      7 """
      8 pad_word = helper.SPECIAL_WORDS['PADDING']
----> 9 generated_script = generate(trained_rnn, vocab_to_int[prime_word + ':'], int_to_vocab, token_dict, vocab_to_int[pad_word], gen_length)
     10 print(generated_script)

3 frames
<ipython-input-39-b86c7a305356> in generate(rnn, prime_id, int_to_vocab, token_dict, pad_value, predict_len)
     53 
     54         # the generated word becomes the next "current sequence" and the cycle can continue
---> 55         current_seq = np.roll(current_seq, -1, 1)
     56         current_seq[-1][-1] = word_i
     57 

<__array_function__ internals> in roll(*args, **kwargs)

/usr/local/lib/python3.6/dist-packages/numpy/core/numeric.py in roll(a, shift, axis)
   1179 
   1180     """
-> 1181     a = asanyarray(a)
   1182     if axis is None:
   1183         return roll(a.ravel(), shift, 0).reshape(a.shape)

/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asanyarray(a, dtype, order)
    136 
    137     """
--> 138     return array(a, dtype, copy=False, order=order, subok=True)
    139 
    140 

/usr/local/lib/python3.6/dist-packages/torch/tensor.py in __array__(self, dtype)
    490     def __array__(self, dtype=None):
    491         if dtype is None:
--> 492             return self.numpy()
    493         else:
    494             return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Can anyone please help me out?

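np.roll(current_seq, -1, 1) expects a NumPy array, but by that point current_seq has been replaced with a tensor by the torch.LongTensor(current_seq).cuda() call at the top of the loop. NumPy therefore tries to convert the tensor to an array through its __array__ method, which calls self.numpy() (see the last frame of the traceback), and that fails because the tensor is on the GPU. A tensor can only be converted to a NumPy array when it is on the CPU, so move it there first: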

current_seq = np.roll(current_seq.cpu(), -1, 1)
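
To see the fix in isolation, here is a minimal sketch of the "shift the window and append the new word" step. The helper name roll_sequence and the sample values are hypothetical and not part of the original project; the device is chosen with torch.cuda.is_available() so the snippet also runs on a machine without a GPU.

```python
import numpy as np
import torch


def roll_sequence(current_seq, next_word_id):
    """Shift the window one step left and put next_word_id in the last slot.

    Hypothetical helper for illustration; accepts a NumPy array or a
    tensor on any device and always returns a NumPy array.
    """
    if isinstance(current_seq, torch.Tensor):
        # .cpu() copies a CUDA tensor to host memory and returns the
        # tensor itself if it is already on the CPU.
        current_seq = current_seq.cpu().numpy()
    rolled = np.roll(current_seq, -1, 1)  # np.roll returns a new array
    rolled[-1][-1] = next_word_id
    return rolled


# Use the GPU when one is present, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A batch of one sequence of length 3: two padding ids (0) and one word id (7).
seq = torch.tensor([[0, 0, 7]], dtype=torch.long, device=device)

seq = roll_sequence(seq, 42)
print(type(seq).__name__, seq.tolist())
```

Another option, under the same assumptions, is to keep current_seq as a NumPy array for the whole loop and assign the tensor that is fed to the network to a separate variable, so that np.roll never receives a tensor at all.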


 