
What does ImageDataGenerator return, and how do I do data augmentation?

I am trying to fit a CNN model (AlexNet architecture) on a dataset of 4900 images (480*640*3), and I would like to do data augmentation. The images and their labels are stored under different paths, so I wrote a custom generator class that uses ImageDataGenerator. The class collects all the paths, stores the image paths and their labels in two lists, then loads the images and labels in batches of 32 and passes them through the ImageDataGenerator.

This is the custom generator method that the model calls during fitting, and it is where I apply the ImageDataGenerator:

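import numpy as np
import matplotlib.pyplot as plt
# Import path assumed; the original post does not show it.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def __getitem__(self, index):
    # Slice out the file paths and labels that belong to this batch.
    batch_x = self.img_filenames[index * self.batch_size:(index + 1) * self.batch_size]
    batch_y = self.labels[index * self.batch_size:(index + 1) * self.batch_size]
    # Random augmentation settings, applied independently to each image.
    gen = ImageDataGenerator(rescale=1./255,
                             rotation_range=90,
                             brightness_range=(0.1, 0.9),
                             horizontal_flip=True)
    # Load the images from disk and stack them into a single array.
    X = np.array([plt.imread(filename) for filename in batch_x])
    X, Y = next(gen.flow(x=X, y=np.array(batch_y), batch_size=self.batch_size))
    return X, Y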

I have some questions:

What is ImageDataGenerator supposed to return? If I pass 32 (batch_size) different images, does it return 32 modified images, one for each input, or 32 images for each input? And if I pass only 1 image with a batch size of 32, does it return 32 modified images from that one? I'm almost sure it is one per input, but I want to confirm.

Secondly, suppose I want to have 40k images. If I reset the index to 0 whenever it exceeds samples//batch_size, and change the len method to multiply the length by 2 or whatever factor I want, then, since the images are generated randomly, I should get 4900 new images, or as many as I want, shouldn't I?

The main problem is that once the model reaches 0.5 accuracy it stops improving. I have tried 3 epochs and the result is the same: accuracy increases for the first 3 or 4 batches and then stalls, which is why I have these doubts.

Thank you.

Let me try to answer the first question. If you pass a batch of 32 images through ImageDataGenerator with only horizontal_flip=True, each image is flipped horizontally at random (with probability 0.5), and flow returns 32 images: one, possibly flipped, per input. The originals are not passed along as extra samples, so the batch does not grow to 32 + 32.
Setting both horizontal_flip and vertical_flip does not change the count either; each image may be flipped on one axis, on both, or on neither. For brightness_range, one brightness factor is drawn at random from the range for each image, so a range of 0.1 to 0.5 still produces 32 images, not 32*5.

I am not sure about the second question. A better choice might be to apply more data augmentation to both the training and test data.
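For what it is worth, the index-wrapping idea from the second question can be sketched without Keras. This is only a minimal illustration under stated assumptions: the WrappedBatches class and its repeats parameter are hypothetical names that do not appear in the original code, and it rounds the batch count up so the last partial batch is kept, whereas the question uses samples//batch_size. In the real generator, __getitem__ would go on to apply the random augmentation to the selected slice.

```python
import math


class WrappedBatches:
    """Hypothetical index-wrapping batch container (not part of Keras)."""

    def __init__(self, filenames, labels, batch_size=32, repeats=2):
        self.filenames = list(filenames)
        self.labels = list(labels)
        self.batch_size = batch_size
        self.repeats = repeats  # passes over the data that count as one epoch
        # Number of batches in a single pass over the real data (rounded up).
        self.batches_per_pass = math.ceil(len(self.filenames) / batch_size)

    def __len__(self):
        # Report a longer epoch: `repeats` passes over the same files.
        return self.batches_per_pass * self.repeats

    def __getitem__(self, index):
        # Wrap the index so it always points at a real batch.
        i = index % self.batches_per_pass
        start, stop = i * self.batch_size, (i + 1) * self.batch_size
        # Random augmentation would be applied to this slice here.
        return self.filenames[start:stop], self.labels[start:stop]


batches = WrappedBatches([f"img_{k}.jpg" for k in range(100)],
                         [k % 2 for k in range(100)])
print(len(batches))              # batches per pass times repeats
print(batches[4] == batches[0])  # index 4 wraps back to the first batch
```

The wrapped batches point at the same source files, so any variety in the extra batches comes only from the random transforms, not from new data.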

For the third question, you could try EfficientNet with focal loss.

I printed X.shape and it seems to be the same 32 images, but modified, so it doesn't multiply the images. The method I described for augmenting the data works fine too.
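The shape check above is consistent with each input image being transformed once at random rather than copied. The NumPy-only sketch below imitates that behaviour with a random left-right flip and one random brightness factor per image. It is a stand-in for illustration, not Keras's implementation, and random_flip_and_brightness is a hypothetical helper; the small 48*64 image size is also just an assumption to keep the example light.

```python
import numpy as np


def random_flip_and_brightness(batch, brightness_range=(0.1, 0.9), rng=None):
    """Hypothetical stand-in for per-image random augmentation (not Keras code)."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty_like(batch, dtype=np.float32)
    for i, img in enumerate(batch):
        # Flip left-right with probability 0.5.
        if rng.random() < 0.5:
            img = img[:, ::-1, :]
        # Draw one brightness factor per image from the given range.
        factor = rng.uniform(*brightness_range)
        out[i] = np.clip(img * factor, 0.0, 1.0)
    return out


# A fake batch of 32 RGB images with values in [0, 1).
batch = np.random.default_rng(0).random((32, 48, 64, 3), dtype=np.float32)
augmented = random_flip_and_brightness(batch, rng=np.random.default_rng(1))
print(batch.shape, augmented.shape)  # the first dimension stays at 32
```

Calling the helper again on the same batch gives a different set of 32 images, which is why repeated passes over the data can act as extra training samples.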


 