
How to create a for loop to extract specific images from a folder to another folder?

I have a table in a CSV file called sample_labels.csv. It contains an image index and class labels, like this:

Image Index labels
00000013_005.png Emphysima
00000013_026.png Emphysima
00000017_001.png No finding
00000042_002.png No finding
00000084_000.png Effusion
00000099_003.png Effusion

I also have a folder that contains the images. The folder is called "train_images".

How can I write a for loop that creates folders called "Emphysima", "No finding" and "Effusion", and stores each image in the folder matching its label?

For example, the two images labelled Emphysima should go in the "Emphysima" folder, and so on.

If I understand your request correctly, here is one way to do it:

Data

import pandas as pd 

d = {
    'Image Index': 
        ['00000013_005.jpeg', '00000013_026.jpeg', 
         '00000017_001.jpeg', '00000042_002.jpg', 
         '00000084_000.jpg', '00000099_003.jpg'],
    
    'labels': 
        ['Emphysima', 'Emphysima', 
         'No finding', 'No finding', 
         'Effusion', 'Effusion']}

df = pd.DataFrame(data=d)
df.head()

         Image Index    labels
0   00000013_005.jpeg   Emphysima
1   00000013_026.jpeg   Emphysima
2   00000017_001.jpeg   No finding
3   00000042_002.jpg    No finding
4   00000084_000.jpg    Effusion
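
In your case the table already lives in sample_labels.csv, so you would probably load it rather than type the dictionary by hand. A minimal sketch, assuming the file is comma-separated with a header row of `Image Index,labels` (I have not seen the actual file); the helper name `load_labels` is only for illustration:

```python
import pandas as pd


def load_labels(csv_path):
    """Read the label table as strings and strip stray whitespace."""
    # Assumption: comma-separated file with the headers 'Image Index' and 'labels'.
    df = pd.read_csv(csv_path, dtype=str)
    df.columns = df.columns.str.strip()
    for col in ("Image Index", "labels"):
        df[col] = df[col].str.strip()
    return df
```

With that, `df = load_labels("sample_labels.csv")` would take the place of the hand-built DataFrame above. If your file uses a different separator, pass `sep=` to `pd.read_csv` accordingly.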

Now, for what you need, you can try this:

from pathlib import Path
from PIL import Image

# folder that holds the original images (named in the question)
src_folder = Path("train_images")

# iterate over the unique labels
for item_name in df['labels'].unique():

    # create a folder named after the label
    item_folder = Path(item_name)
    item_folder.mkdir(parents=True, exist_ok=True)

    # collect image ids and ground-truth labels for this label
    # (avoid the name `id`, which shadows a Python built-in)
    image_ids = []
    gt = []

    # iterate over every row that carries this label
    rows = df.loc[df['labels'] == item_name, ['Image Index', 'labels']]
    for image_id, label in rows.values.tolist():
        image_ids.append(image_id)
        gt.append(label)

        img = Image.open(src_folder / image_id)  # read the image from the source folder
        img.save(item_folder / image_id)         # and save it to the target folder

    # save the ground truth for this label
    # into the same directory
    label_df = pd.DataFrame({
            'Image Index': image_ids,
            'labels': gt
        })
    label_df.to_csv(item_folder / f'{item_name}.csv', index=False)

It will create a directory named after each label, save the corresponding images into it, and write a CSV file in that directory listing only the images with that label.
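
If you only need to move the files around and not re-encode them, a plain file copy may be enough. The sketch below is an alternative to the PIL approach, not a replacement for it: it uses only the standard library (`csv`, `shutil`, `pathlib`), reads the CSV directly, and skips files that are listed but not present. The function name `sort_images_by_label`, the `dst_root` parameter, and the assumption that the CSV is comma-separated with the headers `Image Index` and `labels` are mine, not from the question.

```python
import csv
import shutil
from pathlib import Path


def sort_images_by_label(csv_path, src_dir, dst_root):
    """Copy each image listed in csv_path from src_dir into dst_root/<label>/.

    Returns the file names that were listed in the CSV but not found in src_dir.
    """
    src_dir, dst_root = Path(src_dir), Path(dst_root)
    missing = []
    with open(csv_path, newline="") as f:
        # Assumption: header row is 'Image Index,labels'.
        for row in csv.DictReader(f):
            name = row["Image Index"].strip()
            label = row["labels"].strip()
            src = src_dir / name
            if not src.is_file():
                missing.append(name)  # listed in the CSV but absent on disk
                continue
            target = dst_root / label
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, target / name)  # byte-for-byte copy, keeps timestamps
    return missing
```

Calling `sort_images_by_label("sample_labels.csv", "train_images", ".")` would create the label folders in the current directory. The returned list lets you check for file names that did not match anything in train_images, for example because of a differing extension.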
