
How to save augmented images using ImageDataGenerator and flow_from_directory in Keras

I want to augment images that are in two different directories (folders: benign/malignant) using ImageDataGenerator in Keras.

Then I want to save the augmented images of each class in a separate folder.

My directory structure is as follows:

dataset
|   
|-- original_images
|   |                        
|   |-- benign                  
|   |    |-- benign_image1.png
|   |    |-- benign_image2.png
|   |    |-- ...
|   |
|   |-- malignant                   
|        |-- malignant_image1.png
|        |-- malignant_image2.png
|        |-- ...  
|   
|-- augmented_images
    |                        
    |-- augmented_benign                <-- Here I want to save the augmented images from the benign folder
    |    |-- augmented_img1.png
    |    |-- augmented_img2.png
    |    |-- ...
    |
    |-- augmented_malignant             <-- Here I want to save the augmented images from the malignant folder
         |-- augmented_img1.png
         |-- augmented_img2.png
         |-- ...  

My problem is that I cannot tell the augmented images of the two classes apart, because they are all stored in the same folder.

The "save_to_dir" parameter accepts only a single folder path, so every augmented image ends up in that one folder ( augmented_images ).

How can I save the augmented images of each class in a separate folder ( augmented_benign and augmented_malignant )?

The code I wrote is something like this:

from keras.preprocessing.image import ImageDataGenerator

img_dir_path = "D:/dataset/original_images"
save_dir_path = "D:/dataset/augmented_images"

datagen = ImageDataGenerator(rotation_range=90)

data_generator = datagen.flow_from_directory(
    img_dir_path, 
    target_size=(128, 128), 
    color_mode="rgb", 
    batch_size=20, 
    save_to_dir=save_dir_path, 
    class_mode="binary", 
    save_prefix="augmented", 
    save_format="png")

for i in range(10):
    data_generator.next()

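The code below should give you what you want; I tested it and it does the job. It builds a dataframe of file paths and labels, then runs a separate generator for each class so that each class gets its own save_to_dir. One thing I noticed: with rotation_range=90, the images are rotated by a random angle between -90 and +90 degrees rather than by a fixed 90 degrees. That was kind of a surprise.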

import os
import shutil

import pandas as pd
from keras.preprocessing.image import ImageDataGenerator

sdir = r'c:\temp\dataset'
aug_dir = os.path.join(sdir, 'augmented_images')
if os.path.isdir(aug_dir):  # if aug_dir already exists, remove it to get a clean slate
    shutil.rmtree(aug_dir)
os.mkdir(aug_dir)  # make a new, empty aug_dir
filepaths = []
labels = []
# iterate through original_images and build a dataframe with the columns filepaths, labels
original_images_dir = os.path.join(sdir, 'original_images')
for klass in ['benign', 'malignant']:
    os.mkdir(os.path.join(aug_dir, klass))  # make the class subdirectory in aug_dir
    classpath = os.path.join(original_images_dir, klass)  # path to this class (benign or malignant)
    flist = os.listdir(classpath)  # list of files in this class
    for f in flist:
        fpath = os.path.join(classpath, f)  # full path to the file
        filepaths.append(fpath)
        labels.append(klass)
# build the series once, after all classes have been scanned
Fseries = pd.Series(filepaths, name='filepaths')
Lseries = pd.Series(labels, name='labels')
df = pd.concat([Fseries, Lseries], axis=1)  # create the dataframe
gen = ImageDataGenerator(rotation_range=90)
groups = df.groupby('labels')  # group by class
for label in df['labels'].unique():  # for every class
    group = groups.get_group(label)  # a dataframe holding only the rows with this label
    sample_count = len(group)  # how many samples there are in this class
    aug_img_count = 0
    target_dir = os.path.join(aug_dir, label)  # where to write this class's augmented images
    aug_gen = gen.flow_from_dataframe(group, x_col='filepaths', y_col=None, target_size=(128, 128),
                                      class_mode=None, batch_size=1, shuffle=False,
                                      save_to_dir=target_dir, save_prefix='aug-', save_format='jpg')
    # each call to next() writes one batch of augmented images to target_dir
    while aug_img_count < sample_count:
        images = next(aug_gen)
        aug_img_count += len(images)
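
If you would rather stay with flow_from_directory and skip the dataframe, the same idea can be sketched by creating one generator per class and passing classes=[name], so that each generator scans a single class subfolder and gets its own save_to_dir. This is only a sketch and was not run against the dataset in the question. The function names make_class_dirs, count_files and augment_per_class, the copies parameter, and the "augmented_" folder prefix are my own assumptions for illustration. It also assumes a Keras version that still ships ImageDataGenerator.

```python
import os


def make_class_dirs(aug_root, class_names, prefix="augmented_"):
    """Create one output folder per class and return {class_name: folder_path}."""
    out_dirs = {}
    for name in class_names:
        path = os.path.join(aug_root, prefix + name)
        os.makedirs(path, exist_ok=True)  # also creates aug_root if it is missing
        out_dirs[name] = path
    return out_dirs


def count_files(folder):
    """Return the number of regular files directly inside folder."""
    return sum(1 for entry in os.scandir(folder) if entry.is_file())


def augment_per_class(src_root, aug_root, class_names, copies=1):
    """Make `copies` passes over each class, saving augmented images per class."""
    # Imported here so the two helpers above can be used without Keras installed.
    from keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rotation_range=90)
    for name, out_dir in make_class_dirs(aug_root, class_names).items():
        flow = datagen.flow_from_directory(
            src_root,
            classes=[name],       # scan only this class subfolder
            class_mode=None,      # yield image batches only, no labels
            target_size=(128, 128),
            color_mode="rgb",
            batch_size=20,
            shuffle=False,
            save_to_dir=out_dir,  # a separate folder for each class
            save_prefix="augmented",
            save_format="png",
        )
        # len(flow) is the number of batches in one pass over this class.
        for _ in range(copies * len(flow)):
            next(flow)
```

With the paths from the question, the call would be augment_per_class("D:/dataset/original_images", "D:/dataset/augmented_images", ["benign", "malignant"]). Afterwards, count_files on each output folder gives a quick check that both classes were written to their own folders.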
