Data preprocessing for a custom dataset in PyTorch (transforms.Normalize)

Question

I am new to PyTorch and CNNs, and I am confused about data preprocessing. I am not sure how to apply transforms.Normalize to my dataset. In essence, how do you calculate the mean and standard deviation for a custom dataset?

I am loading my data using ImageFolder. The images are of different sizes.

train_transforms = transforms.Compose([transforms.Resize(size=224),
                                       transforms.ToTensor(),
                                       transforms.Normalize((?), (?))
                                       ])
train_dataset = datasets.ImageFolder(root='roota/',
                                     transform=train_transforms)

Answer 1

If you're planning to train your network from scratch, you can calculate your dataset's statistics yourself. They are computed once, before training. You can use ImageFolder (without the Normalize transform) to loop through the images and accumulate the statistics. For example, in pseudocode:

for inputs, labels in dataloader:
    # accumulate the per-channel mean and std dev over all images
    # save the results for later use in transforms.Normalize
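
The loop above could be fleshed out roughly as follows. This is only a sketch under some assumptions: the helper name compute_channel_stats is made up for illustration, the loader is assumed to yield (images, labels) batches with images shaped (B, C, H, W) as ToTensor produces, and the standard deviation is the population value over all pixels of each channel.

```python
import torch


def compute_channel_stats(loader):
    """Return per-channel (mean, std) over every pixel yielded by `loader`.

    `loader` is any iterable of (images, labels) batches, where `images`
    is a float tensor of shape (B, C, H, W). Hypothetical helper name.
    """
    channel_sum = None
    channel_sq_sum = None
    n_pixels = 0

    for images, _ in loader:
        # Accumulate in float64 to limit rounding error on large datasets.
        images = images.to(torch.float64)
        c = images.shape[1]
        # Bring channels to the front and flatten everything else: (C, B*H*W).
        flat = images.permute(1, 0, 2, 3).reshape(c, -1)

        if channel_sum is None:
            channel_sum = torch.zeros(c, dtype=torch.float64)
            channel_sq_sum = torch.zeros(c, dtype=torch.float64)

        channel_sum += flat.sum(dim=1)
        channel_sq_sum += (flat ** 2).sum(dim=1)
        n_pixels += flat.shape[1]

    mean = channel_sum / n_pixels
    # Var[x] = E[x^2] - E[x]^2; clamp guards against tiny negative values.
    std = (channel_sq_sum / n_pixels - mean ** 2).clamp(min=0).sqrt()
    return mean.float(), std.float()
```

To use it on the dataset from the question, one would build the ImageFolder with only the resize and ToTensor transforms, wrap it in a DataLoader, and pass the resulting mean.tolist() and std.tolist() to transforms.Normalize. Note that Resize(size=224) only rescales the shorter edge, so images of different aspect ratios still come out with different shapes; something like Resize((224, 224)) or a batch size of 1 is probably needed for the loop to batch them.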

Typically, CNNs are pretrained on other, larger datasets such as ImageNet, primarily to reduce training time. If you are using a pretrained network, you can use the mean and standard deviation of the original dataset for your training.

1 answer

solution1
3 ACCEPTED 2018-09-25 03:58:11