
PyTorch DataLoader shuffle with multiple datasets

I'm trying to build a custom DataLoader setup with multiple datasets.

My question: if I use shuffle=True in the DataLoader options, is it possible to get the same shuffle order across multiple DataLoaders?

For example:

dataloader1: label = [5, 4, 15, 16]

dataloader2: label = [5, 4, 15, 16]

Edit: PyTorch's dataloaders already have a built-in solution for this.

See https://pytorch.org/docs/stable/data.html#torch.utils.data.Sampler — you can specify the sampler yourself, so you can create a generator and give it to all dataloaders.
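One way to realize this (a sketch, not code from the original answer): give each DataLoader its own torch.Generator seeded with the same value. Note that sharing one generator object would not work, because the first loader's sampler consumes random state before the second one draws its permutation. The toy dataset and the seed value below are illustrative assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset: labels 0..9 (illustrative assumption).
labels = torch.arange(10)
ds1 = TensorDataset(labels)
ds2 = TensorDataset(labels)

# Two generators seeded identically -> identical shuffle orders.
g1 = torch.Generator().manual_seed(42)
g2 = torch.Generator().manual_seed(42)

# DataLoader uses the generator for its internal RandomSampler
# when shuffle=True.
dl1 = DataLoader(ds1, batch_size=2, shuffle=True, generator=g1)
dl2 = DataLoader(ds2, batch_size=2, shuffle=True, generator=g2)

order1 = [batch[0].tolist() for batch in dl1]
order2 = [batch[0].tolist() for batch in dl2]
print(order1 == order2)  # both loaders yield the same shuffled order
```

If the two loaders run for multiple epochs, keep them in lockstep (iterate both each epoch) so their generators stay synchronized.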

Old (and a bit more hacky) answer:

If keeping the order is really important, instead of writing a custom dataloader it may be better to write a custom dataset.

Note that this only works if all datasets have the same number of examples; otherwise you would have to drop part of the data from the larger datasets.

Something in those lines should work:

from torch.utils.data import Dataset

class ManyDatasetsInOne(Dataset):
    def __init__(self, **parameters):
        self.dataset1 = dataset1(**parameters_1)
        self.dataset2 = dataset2(**parameters_2)

    def __len__(self):
        return len(self.dataset1)

    def __getitem__(self, index):
        # Use the same index for both datasets so the pairing survives shuffling.
        data1 = self.dataset1[index]
        data2 = self.dataset2[index]

        return data1, data2
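A runnable sketch of the same idea, using a generic wrapper around two pre-built datasets (the TensorDataset instances and batch size are illustrative assumptions): because a single index fetches from both datasets, shuffling the combined dataset keeps the pairs aligned.

```python
import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset

class PairedDataset(Dataset):
    """Wraps two equally sized datasets so one index fetches from both."""
    def __init__(self, dataset_a, dataset_b):
        assert len(dataset_a) == len(dataset_b), "datasets must be the same size"
        self.dataset_a = dataset_a
        self.dataset_b = dataset_b

    def __len__(self):
        return len(self.dataset_a)

    def __getitem__(self, index):
        # Same index into both datasets -> items stay paired after shuffling.
        return self.dataset_a[index], self.dataset_b[index]

labels = torch.arange(8)
paired = PairedDataset(TensorDataset(labels), TensorDataset(labels))
loader = DataLoader(paired, batch_size=4, shuffle=True)

for (a,), (b,) in loader:
    print(a.tolist(), b.tolist())  # a and b always match within a batch
```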
