My aim is to build an image classification model for flowers. The data RAR file contains a folder named train data with about 16,000 images, named 0 to 16000. There is a similar folder for the test data.
Apart from this, there are two CSV files. The first has two columns, label and flower class, with 104 labels and 104 flower classes. The second also has two columns, id and flower class. It corresponds to the train data and has the same number of rows as the train data folder has images (approx. 16,000).
For example, assume the image named 10 in the train data folder is a sunflower. Then in the second CSV file, the flower class entry for id = 10 is sunflower.
I have figured out that the first step is to store the images of each flower class in a separate directory. I have created 104 folders, but I am struggling with renaming my images. Only after renaming them can I move them into their respective directories.
The data is available here https://www.kaggle.com/ianmoone0617/flower-goggle-tpu-classification
dire = r'C:\Users\Ben\Desktop\Flower classification\flower_tpu\trial_2\\'
for i in range(0,7,1):
    fl_name = flowers_idx['flower_cls'][flowers_idx['id'] == i].iloc[0]
    for count, filename in enumerate(os.listdir(dire)):
        dst = fl_name + ' ' + str(count) + ".JPEG"
        src = dire + filename
        dst = dire + dst
        os.rename(src, dst)
This was my attempt to rename the files according to the flower class name queried from the CSV, but it renames all the flowers with the name of the last flower.
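The behaviour described above is what the nested loops produce: on every pass of the outer loop, the inner loop renames every file in the directory, so after the last pass all files carry the last class name. A minimal sketch of one way around it is to look up each file's own id and move the file straight into its class folder, with no renaming step. It assumes each image file is named after its id (for example 10.jpeg); the function name sort_images_by_class and the id_to_class dict are hypothetical, not part of the dataset.

```python
import os
import shutil


def sort_images_by_class(src_dir, id_to_class, ext=".jpeg"):
    """Move each image named '<id><ext>' into a subfolder named after its class.

    id_to_class is a hypothetical dict {image id: class name}. With pandas it
    could be built as dict(zip(flowers_idx['id'], flowers_idx['flower_cls'])).
    Returns a dict {image id: new path} for the files that were moved.
    """
    moved = {}
    for filename in os.listdir(src_dir):
        stem, file_ext = os.path.splitext(filename)
        # Skip class folders and anything that is not '<number><ext>'.
        if file_ext.lower() != ext.lower() or not stem.isdigit():
            continue
        class_name = id_to_class.get(int(stem))
        if class_name is None:
            continue  # no CSV row for this id, leave the file in place
        class_dir = os.path.join(src_dir, class_name)
        os.makedirs(class_dir, exist_ok=True)
        dst = os.path.join(class_dir, filename)
        shutil.move(os.path.join(src_dir, filename), dst)
        moved[int(stem)] = dst
    return moved
```

Because each file is handled exactly once and keeps its original name, the class of one image can no longer overwrite the name of another.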
Welcome to this community. You don't need to reorganize the images into different folders. Read the two CSV files using pandas:
import pandas as pd
label_csv = pd.read_csv("flowers_label.csv")
flowers_csv = pd.read_csv("flowers_idx.csv")
Now you can loop through flowers_csv and load the images as NumPy arrays. Here is the code snippet:
import numpy as np
from PIL import Image

X = []  # images
y = []  # labels
base_url = "flowers_google/"
for idx in range(len(flowers_csv)):
    # get the flower row
    flower = flowers_csv.iloc[idx]
    # create flower path
    path = f"{base_url}{flower.id}.jpeg"
    # load image
    img = Image.open(path)
    # convert to numpy
    img = np.array(img)
    # save to X
    X.append(img)
    # get label
    label = label_csv[label_csv['flower_class'] == flower.flower_cls].label.values[0]
    # save to y
    y.append(label)
You can also create your own custom Keras dataset class.
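As a rough illustration of that idea, here is a minimal sketch of a batch-wise dataset class. It follows the protocol used by keras.utils.Sequence, where __len__ returns the number of batches and __getitem__ returns one batch, but it is written in plain Python so it runs without Keras. The class name FlowerBatches and the load_image callable are hypothetical. A real Keras dataset would subclass keras.utils.Sequence and stack each batch into NumPy arrays.

```python
import math


class FlowerBatches:
    """Plain-Python stand-in for a Keras Sequence-style dataset (hypothetical)."""

    def __init__(self, paths, labels, batch_size, load_image):
        if len(paths) != len(labels):
            raise ValueError("paths and labels must have the same length")
        self.paths = list(paths)
        self.labels = list(labels)
        self.batch_size = batch_size
        # load_image is any callable path -> image,
        # e.g. lambda p: np.array(Image.open(p))
        self.load_image = load_image

    def __len__(self):
        # Number of batches, counting a final partial batch.
        return math.ceil(len(self.paths) / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        stop = start + self.batch_size
        # Images are loaded only when the batch is requested.
        images = [self.load_image(p) for p in self.paths[start:stop]]
        return images, self.labels[start:stop]
```

Loading images per batch keeps only one batch in memory at a time, which may matter with roughly 16,000 images.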