
Load csv and Image dataset in PyTorch

I am doing image classification with PyTorch. I have a separate images folder, plus train and test csv files that contain the image IDs and labels. I don't know how to combine the images with their IDs and labels and convert them into tensors.

  1. train.csv: contains the IDs of all images (4325.jpg, 2345.jpg, and so on) together with their labels, such as cat and dog.
  2. Image_data: contains all the images, each named by its ID.

You can create a custom dataset class by inheriting from PyTorch's torch.utils.data.Dataset.

The following custom dataset class assumes that:

  • The csv file has the following format:

filename   label
4325.jpg   cat
2345.jpg   dog

  • All images are inside the images folder.

import os

import pandas as pd
import torch
from PIL import Image

class CustomDataset(torch.utils.data.Dataset):
    def __init__(self, csv_path, images_folder, transform=None):
        self.df = pd.read_csv(csv_path)
        self.images_folder = images_folder
        self.transform = transform
        # Map each class name to an integer label.
        self.class2index = {"cat": 0, "dog": 1}

    def __len__(self):
        return len(self.df)

    def __getitem__(self, index):
        # Select the row by position; the column names match the csv header.
        row = self.df.iloc[index]
        filename = row["filename"]
        label = self.class2index[row["label"]]
        image = Image.open(os.path.join(self.images_folder, filename))
        if self.transform is not None:
            image = self.transform(image)
        return image, label
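
The class above hardcodes the label mapping for cat and dog. With more classes, the mapping could instead be derived from the label column of the csv file. The following is a minimal sketch using only the standard library; `build_class_index` is a hypothetical helper, not part of PyTorch, and it assumes a comma-separated file with a `label` column as in the format described above.

```python
import csv
import io


def build_class_index(csv_file, label_column="label"):
    """Map each distinct label in the csv to an integer index, sorted by name."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or label_column not in reader.fieldnames:
        raise ValueError(f"csv has no {label_column!r} column")
    # Sorting makes the mapping independent of the row order in the file.
    labels = sorted({row[label_column].strip() for row in reader})
    return {label: index for index, label in enumerate(labels)}


# Tiny in-memory csv in the assumed format (comma-separated is an assumption).
sample_csv = io.StringIO("filename,label\n4325.jpg,cat\n2345.jpg,dog\n")
class2index = build_class_index(sample_csv)
print(class2index)  # {'cat': 0, 'dog': 1}
```

If you go this route, it is probably safest to build the mapping once from train.csv and reuse it for the test dataset, so that both use the same indices.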
        

Now you can use the CustomDataset class to load the training and test datasets from the csv files and the images folder.


train_dataset = CustomDataset("path-to-train.csv", "path-to-images-folder")
test_dataset = CustomDataset("path-to-test.csv", "path-to-images-folder")

image, label = train_dataset[0]

