
Image classification model

My aim is to build an image classification model for flowers. The data RAR file contains a folder named train data, which holds about 16,000 images labelled from 0 to 16000. There is a similar folder for the test data.

Apart from this, there are two CSV files. The first has two columns, label and flower class, with 104 labels and 104 flower classes. The second also has two columns, id and flower class. It corresponds to the train data and has the same number of rows as the train data folder has images (approx. 16,000).

For example, assume that the image labelled 10 in the train data folder is a sunflower. Then, in the second CSV file, the flower class entry for id = 10 is sunflower.

I have figured out that the first step is to store the images of each flower class in a separate directory. I have created 104 folders, but I am struggling with renaming my images. Only after renaming them can I move them into their respective directories.

The data is available here: https://www.kaggle.com/ianmoone0617/flower-goggle-tpu-classification

dire = r'C:\Users\Ben\Desktop\Flower classification\flower_tpu\trial_2\\'

for i in range(0,7,1):
    fl_name = flowers_idx['flower_cls'][flowers_idx['id'] == i].iloc[0]
    for count, filename in enumerate(os.listdir(dire)):
        dst = fl_name + ' ' + str(count) + ".JPEG"
        src = dire + filename 
        dst = dire + dst
        os.rename(src, dst)

This was my attempt to rename each image after the flower class looked up in the CSV, but it gives every image the name of the last flower class.
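The loop above walks the whole of os.listdir(dire) for every value of i, so each pass of the outer loop renames every file again and only the last class name survives. A minimal sketch of a per-file alternative follows. It assumes each image is named after its id (for example 10.jpeg) and that id_to_class is a dict built from the second CSV; the function name sort_into_class_folders and its parameters are hypothetical, not part of the original code.

```python
import os
import shutil


def sort_into_class_folders(image_dir, id_to_class):
    """Move each image into a subfolder named after its flower class.

    Assumes file names look like "<id>.<ext>" and that id_to_class maps
    the integer id to a class name (both are assumptions).
    Returns the number of files moved.
    """
    moved = 0
    for filename in os.listdir(image_dir):
        src = os.path.join(image_dir, filename)
        stem, _ext = os.path.splitext(filename)
        # Skip subfolders and files whose name is not a plain number.
        if not os.path.isfile(src) or not stem.isdigit():
            continue
        flower_cls = id_to_class.get(int(stem))
        if flower_cls is None:
            # No class is known for this id, so leave the file in place.
            continue
        class_dir = os.path.join(image_dir, flower_cls)
        os.makedirs(class_dir, exist_ok=True)
        # The original file name is kept, so files cannot collide.
        shutil.move(src, os.path.join(class_dir, filename))
        moved += 1
    return moved
```

With the table from the question, the mapping could be built as {int(i): c for i, c in zip(flowers_idx['id'], flowers_idx['flower_cls'])}. Because each file is looked up by its own id and keeps its own name, no renaming step is needed before the move.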

Welcome to this community. You don't need to reorganize the images into different folders. Read the two CSV files using pandas:

import pandas as pd

label_csv = pd.read_csv("flowers_label.csv")
flowers_csv = pd.read_csv("flowers_idx.csv")

Now you can loop through flowers_csv and load the images as NumPy arrays. Here is a code snippet:

import numpy as np
from PIL import Image

X = []  # images
y = []  # labels

base_url = "flowers_google/"

for idx in range(len(flowers_csv)):
  # get the flower row
  flower = flowers_csv.iloc[idx]
  # build the image path from the folder and the flower id
  path = f"{base_url}{flower.id}.jpeg"
  # load the image
  img = Image.open(path)
  # convert to a numpy array
  img = np.array(img)
  # save to X
  X.append(img)

  # look up the numeric label for this flower class
  label = label_csv[label_csv['flower_class'] == flower.flower_cls].label.values[0]
  # save to y
  y.append(label)
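
The loop above filters label_csv once for every image. As an alternative sketch, the class names can be turned into labels in one step with a dict and Series.map. The column names are taken from the code above; the two small frames and their rows are made-up stand-ins for the real CSV files, and the names label_demo and flowers_demo are hypothetical.

```python
import pandas as pd

# Small stand-ins for the two CSV files (made-up rows, for illustration only).
label_demo = pd.DataFrame({"label": [0, 1], "flower_class": ["rose", "sunflower"]})
flowers_demo = pd.DataFrame(
    {"id": [0, 1, 2], "flower_cls": ["sunflower", "rose", "sunflower"]}
)

# Build the class-name -> label lookup once.
class_to_label = dict(zip(label_demo["flower_class"], label_demo["label"]))

# Map every row's class name to its numeric label in a single pass.
flowers_demo["label"] = flowers_demo["flower_cls"].map(class_to_label)

labels = flowers_demo["label"].tolist()
```

A class name that does not appear in the lookup becomes NaN in the new column, which makes mismatches between the two files easy to spot before training.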

You can also create your own custom Keras dataset class.
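
One way to read that suggestion is a class that loads images batch by batch instead of holding all of them in memory. The sketch below is framework-free and only mirrors the __len__ / __getitem__ shape that keras.utils.Sequence subclasses implement. The class name FlowerBatches, its parameters, and the injectable load_fn are hypothetical; subclassing the Keras base class and stacking each batch into arrays would be a further step that is not shown here.

```python
import math


def load_image(path):
    """Default loader: open an image with Pillow and return a NumPy array."""
    # Imported lazily so the class can be exercised without these packages.
    import numpy as np
    from PIL import Image

    with Image.open(path) as img:
        return np.array(img)


class FlowerBatches:
    """Serve (images, labels) batches, loading images only when asked."""

    def __init__(self, paths, labels, batch_size=32, load_fn=load_image):
        if len(paths) != len(labels):
            raise ValueError("paths and labels must have the same length")
        self.paths = list(paths)
        self.labels = list(labels)
        self.batch_size = batch_size
        self.load_fn = load_fn

    def __len__(self):
        # Number of batches; the last one may be smaller.
        return math.ceil(len(self.paths) / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        stop = start + self.batch_size
        # Images are read from disk here, one batch at a time.
        images = [self.load_fn(p) for p in self.paths[start:stop]]
        return images, self.labels[start:stop]
```

The paths and labels would come from the same two CSV files as above. Keeping only paths in memory matters here because roughly 16,000 decoded images can be large.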
