
Pytorch normalize 2D tensor

For more robustness of my model, I want to normalize my feature tensor.

I tried doing it in the way that is, to the best of my knowledge, standard for pictures:

import torch
from torchvision import transforms

class Dataset(torch.utils.data.Dataset):
    """Characterizes a dataset for PyTorch."""
    def __init__(self, input_tensor, transform=transforms.Normalize(mean=0.5, std=0.5)):
        self.labels = input_tensor[:, :, -1]
        self.features = input_tensor[:, :, :-1]
        self.transform = transform

    def __len__(self):
        return self.labels.shape[0]

    def __getitem__(self, index):
        # Load data and get label
        X = self.features[index]
        y = self.labels[index]
        if self.transform:
            X = self.transform(X)
        return X, y

But I receive this error message:

ValueError: Expected tensor to be a tensor image of size (C, H, W). Got tensor.size() = torch.Size([8, 25]).

Everywhere I looked, people suggest that one should use .view to generate the third dimension in order to comply with the standard shape of pictures, but this seems very odd to me. Is there maybe a cleaner way to do this? Also, where should I best place the normalization? Just for the batch, or for the entire train dataset?

You are asking two different questions; I will try to answer both.

  • Indeed, you should first reshape to (c, h, w) where c is the channel dimension. In most cases, you will need that extra dimension because most 'image' layers are built to receive 3-dimensional tensors (not counting the batch dimension), such as nn.Conv2d, nn.BatchNorm2d, etc. I don't believe there's any way around it, and avoiding it would restrict you to single-channel image datasets.

    You can reshape to the desired shape with torch.reshape or Tensor.view:

     X = X.reshape(1, *X.shape)

    Or by adding an additional dimension using Tensor.unsqueeze:

     X = X.unsqueeze(0)
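A minimal check, using the (8, 25) shape from the question, that both approaches produce the same 3-dimensional tensor:

```python
import torch

# One 2D feature sample with the shape from the question (8 x 25)
X = torch.rand(8, 25)

a = X.reshape(1, *X.shape)   # add a leading channel dimension
b = X.unsqueeze(0)           # equivalent: insert a new dim at position 0

assert a.shape == b.shape == (1, 8, 25)
assert torch.equal(a, b)     # same values, same layout
```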
  • About normalization: batch-normalization and dataset-normalization are two different approaches.

    The former is a technique that can achieve improved performance in convolutional networks. This kind of operation can be implemented using a nn.BatchNorm2d layer and is done using learnable parameters: a scale factor (~ std) and a bias (~ mean). This type of normalization is applied when the model is called, and is applied per batch.
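    As an illustrative sketch (the shapes here are assumptions matching the question's 8 x 25 features, not part of the original answer), nn.BatchNorm2d normalizes each channel over the batch and carries one learnable scale and bias per channel:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(num_features=1)   # one learnable (scale, bias) pair per channel
x = torch.rand(16, 1, 8, 25)          # (batch, channels, height, width)
y = bn(x)                             # statistics computed over the batch at call time

# weight (~ std scale) and bias (~ mean shift) are trainable parameters
assert bn.weight.requires_grad and bn.bias.requires_grad
assert y.shape == x.shape
```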

    The latter is a pre-processing technique which allows making different features have the same scale. This normalization can be applied inside the dataset, per element. It requires you to measure the mean and standard deviation of your training set.
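    A sketch of this second approach, assuming a hypothetical training set shaped like the question's features: measure the statistics once on the training data, then apply them to every element:

```python
import torch

# Hypothetical training set of 100 samples, each shaped like the question's features
train_features = torch.rand(100, 8, 25)

# Measure the statistics once, on the training data only
mean = train_features.mean()
std = train_features.std()

def normalize(x):
    """Apply the pre-computed training statistics to a sample (or a whole set)."""
    return (x - mean) / std

normalized = normalize(train_features)  # now roughly zero-mean, unit-std
```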

