
Pytorch tensor to numpy array

I have a pytorch Tensor of size torch.Size([4, 3, 966, 1296]).

I want to convert it to a numpy array using the following code:

imgs = imgs.numpy()[:, ::-1, :, :]

Can anyone please explain what this code is doing?

I believe you also have to use .detach(). I had to convert my Tensor to a numpy array on Colab, which uses CUDA and a GPU. I did it like the following:

# this is just my embedding matrix which is a Torch tensor object
embedding = learn.model.u_weight

embedding_list = list(range(0, 64382))

input = torch.cuda.LongTensor(embedding_list)
tensor_array = embedding(input)
# the output of the line below is a numpy array
tensor_array.cpu().detach().numpy()

The tensor you want to convert has 4 dimensions.

[:, ::-1, :, :] 

: means that the first dimension should be copied as-is; the same goes for the third and fourth dimensions.

::-1 means that the second axis is reversed.
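
For an image batch shaped (N, C, H, W), reversing axis 1 flips the channel order, e.g. RGB to BGR. A minimal sketch (the small shapes here are made up for illustration):

import torch

imgs = torch.arange(2 * 3 * 2 * 2).reshape(2, 3, 2, 2)  # (batch, channels, H, W)
flipped = imgs.numpy()[:, ::-1, :, :]  # convert to numpy and reverse the channel axis

print(imgs[0, :, 0, 0])     # tensor([0, 4, 8]) -> channels in order 0, 1, 2
print(flipped[0, :, 0, 0])  # [8 4 0]           -> channels in order 2, 1, 0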

This worked for me:

np_arr = torch_tensor.cpu().detach().numpy()

While other answers explained the question perfectly, I will add some real-life examples of converting tensors to numpy arrays:

Example: Shared storage

A PyTorch tensor residing on the CPU shares the same storage as the numpy array na:

import torch
a = torch.ones((1,2))
print(a)
na = a.numpy()
na[0][0]=10
print(na)
print(a)

Output:

tensor([[1., 1.]])
[[10.  1.]]
tensor([[10.,  1.]])

Example: Eliminate the effect of shared storage by copying the numpy array first

To avoid the effect of shared storage we need to copy() the numpy array na to a new numpy array nac. The numpy copy() method creates new, separate storage.

import torch
a = torch.ones((1,2))
print(a)
na = a.numpy()
nac = na.copy()
nac[0][0]=10
print(nac)
print(na)
print(a)

Output:

tensor([[1., 1.]])
[[10.  1.]]
[[1. 1.]]
tensor([[1., 1.]])

Now only the nac numpy array is altered by the line nac[0][0]=10; na and a remain as they were.

Example: CPU tensor with requires_grad=True

import torch
a = torch.ones((1,2), requires_grad=True)
print(a)
na = a.detach().numpy()
na[0][0]=10
print(na)
print(a)

Output:

tensor([[1., 1.]], requires_grad=True)
[[10.  1.]]
tensor([[10.,  1.]], requires_grad=True)

If instead we call:

na = a.numpy() 

This would cause: RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead. This happens because tensors with requires_grad=True are recorded by PyTorch autograd. Note that tensor.detach() is the modern replacement for tensor.data.

This explains why we need to detach() them first before converting with numpy().
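
A short sketch that reproduces the error and the fix:

import torch

a = torch.ones((1, 2), requires_grad=True)
try:
    a.numpy()  # fails: the tensor is tracked by autograd
except RuntimeError as e:
    print(e)   # Can't call numpy() on Tensor that requires grad...
na = a.detach().numpy()  # works: the detached tensor is outside the graph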

Example: CUDA tensor with requires_grad=False

a = torch.ones((1,2), device='cuda')
print(a)
na = a.to('cpu').numpy()
na[0][0]=10
print(na)
print(a)

Output:

tensor([[1., 1.]], device='cuda:0')
[[10.  1.]]
tensor([[1., 1.]], device='cuda:0')

Example: CUDA tensor with requires_grad=True

a = torch.ones((1,2), device='cuda', requires_grad=True)
print(a)
na = a.detach().to('cpu').numpy()
na[0][0]=10
print(na)
print(a)

Output:

tensor([[1., 1.]], device='cuda:0', requires_grad=True)
[[10.  1.]]
tensor([[1., 1.]], device='cuda:0', requires_grad=True)

Without the detach() method, the error RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead. will be raised.

Without the .to('cpu') method, the error TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. will be raised.

You could use cpu() instead of to('cpu'), but I prefer the newer to('cpu').
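
For this conversion the two are interchangeable; a minimal sketch (assuming a CUDA device is available):

a = torch.ones((1, 2), device='cuda', requires_grad=True)
na1 = a.detach().to('cpu').numpy()  # using to('cpu')
na2 = a.detach().cpu().numpy()      # equivalent shorthand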

If your variable has some grads attached, you can use this syntax:

y = torch.Tensor.cpu(x).detach().numpy()[:, :, :, -1]
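
Calling torch.Tensor.cpu as an unbound function works, but the same conversion is more commonly written as method calls on the tensor itself, e.g.:

y = x.cpu().detach().numpy()[:, :, :, -1]  # equivalent method-call form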

Your question is very poorly worded. Your code (sort of) already does what you want. What exactly are you confused about? x.numpy() answers the original title of your question:

Pytorch tensor to numpy array

You need to improve your question, starting with your title.

Anyway, just in case this is useful to others: you might need to call detach for your code to work, e.g.

RuntimeError: Can't call numpy() on Variable that requires grad.

So call .detach(). Sample code:

# creating data and running through a nn and saving it

import torch
import torch.nn as nn

from pathlib import Path
from collections import OrderedDict

import numpy as np

path = Path('~/data/tmp/').expanduser()
path.mkdir(parents=True, exist_ok=True)

num_samples = 3
Din, Dout = 1, 1
lb, ub = -1, 1

x = torch.distributions.Uniform(low=lb, high=ub).sample((num_samples, Din))

f = nn.Sequential(OrderedDict([
    ('f1', nn.Linear(Din,Dout)),
    ('out', nn.SELU())
]))
y = f(x)

# save data
# y.numpy()  # would raise: RuntimeError: Can't call numpy() on Variable that requires grad.
x_np, y_np = x.detach().cpu().numpy(), y.detach().cpu().numpy()
np.savez(path / 'db', x=x_np, y=y_np)

print(x_np)

cpu() goes after detach(). See: https://discuss.pytorch.org/t/should-it-really-be-necessary-to-do-var-detach-cpu-numpy/35489/5


Also, I won't make any comments on the slicing since that is off topic and should not be the focus of your question. See this:

Understanding slice notation


 