PyTorch - How to manually normalize/transform data for a DataLoader

Question

I am following along with a LinkedIn Learning tutorial for neural networks. I am trying to follow along using a different dataset than the one in the tutorial, while applying the same techniques to my own dataset. I am struggling to figure out how to normalize/transform my data in the same way they do, because they are using some built-in functionality that I do not know how to reproduce.

Here is an example of what they are doing:

import torch
from torchvision import datasets, transforms

mean, std = (0.5,), (0.5,)

# Create a transform and normalise data
transform = transforms.Compose([transforms.ToTensor(),
                            transforms.Normalize(mean, std)
                          ])

# Download FMNIST training dataset and load training data
trainset = datasets.FashionMNIST('~/.pytorch/FMNIST/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

They are creating a transform and then passing it straight into this FashionMNIST method, which seems to apply the transform to the trainset automatically.

I want to do a similar thing, but for my dataset there is no built-in FashionMNIST method. How would I replicate it?

Here's what I'm doing/know how to do:

import pandas as pd
import torch

df = pd.read_csv('../input/sign-language-mnist/sign_mnist_train.csv')
trainloader = torch.utils.data.DataLoader(df, batch_size = 64, shuffle = True)

How would I go about applying the same transform to my df without the help of this built-in FashionMNIST method?

Answer 1

You need to build a custom PyTorch dataset to pass to your DataLoader:

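import numpy as np
import torch
from torch.utils.data import Dataset

class MNistDataset(Dataset):
    def __init__(self, df, mean=0.5, std=0.5):
        # Assumption: df has a 'label' column plus one column per pixel (values 0-255)
        self.labels = torch.tensor(df['label'].values, dtype=torch.long)
        pixels = df.drop(columns='label').values.astype(np.float32)
        self.images = self.custom_norm_function(torch.from_numpy(pixels), mean, std)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Return one (image, label) pair; the DataLoader collates these into batches
        return self.images[idx], self.labels[idx]

    @staticmethod
    def custom_norm_function(x, mean, std):
        # Placeholder normalization: scale to [0, 1], then standardize with mean/std
        return (x / 255.0 - mean) / std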

Define "custom_norm_function" as needed, then pass the dataset to your DataLoader.
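As a rough illustration of what such a function could look like, the sketch below reproduces the arithmetic of ToTensor() followed by Normalize(mean, std) for raw pixel values. The helper name normalize_pixels is hypothetical, and it assumes pixel values in the 0-255 range and the mean/std of 0.5 used in the question.

```python
import numpy as np
import torch

def normalize_pixels(pixels, mean=0.5, std=0.5):
    """Hypothetical helper: scale 0-255 pixel values to [0, 1], then standardize."""
    x = torch.as_tensor(np.asarray(pixels), dtype=torch.float32)
    x = x / 255.0            # the scaling ToTensor() applies to uint8 image input
    return (x - mean) / std  # the per-channel step Normalize(mean, std) applies

# With mean=0.5 and std=0.5 the range [0, 255] is mapped onto [-1, 1]
raw = np.array([[0, 127.5, 255]])
normalized = normalize_pixels(raw)
print(normalized)
```

With these defaults a black pixel becomes -1, mid-gray becomes 0, and white becomes 1, which matches the range produced by the tutorial's transform.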

from torch.utils.data import DataLoader

dataset = MNistDataset(df)
trainloader = DataLoader(dataset, batch_size=64, shuffle=True)
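If writing a class feels like more than you need, one possible alternative (not what the answer above uses) is the built-in torch.utils.data.TensorDataset. The sketch below normalizes everything up front and wraps the resulting tensors. It uses a small synthetic DataFrame as a stand-in for the real CSV and assumes a 'label' column followed by 784 pixel columns (28x28 images).

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for pd.read_csv(...): 10 rows of 784 random pixel values
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 256, size=(10, 784)))
df.insert(0, "label", rng.integers(0, 24, size=10))

# Split labels from pixels and convert both to tensors
labels = torch.tensor(df["label"].to_numpy(), dtype=torch.long)
pixels = torch.tensor(df.drop(columns="label").to_numpy(), dtype=torch.float32)

# Same arithmetic as ToTensor() + Normalize((0.5,), (0.5,)), applied once up front
images = ((pixels / 255.0) - 0.5) / 0.5
images = images.view(-1, 1, 28, 28)  # N x C x H x W (assumes 28x28 images)

dataset = TensorDataset(images, labels)
trainloader = DataLoader(dataset, batch_size=4, shuffle=True)

batch_images, batch_labels = next(iter(trainloader))
print(batch_images.shape, batch_labels.shape)
```

This trades flexibility for brevity: the whole dataset is held in memory and the normalization is fixed, so it suits small tabular image sets better than large ones.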

You can read more here: https://pytorch.org/tutorials/beginner/data_loading_tutorial.html

Solution 1
0 2022-05-28 03:54:16
