Crop multiple images from a folder if the image name matches image_id in a CSV file in Python
I have a CSV file listing about 300 image_id values together with their bounding box positions. I also have a folder of about 300 images, where each image's file name matches its image_id. How do I compare each image's name with the image_id and, when they match, crop the image?
I use Python on Ubuntu.
import os
import pandas

# Read the CSV file. Edit 2: dtype=str keeps image_id as text, because pandas
# will assume int if the column holds only integers.
data = pandas.read_csv(your_csv_file, dtype={'image_id': str})
# Directory containing the images
path = "path folder"
dirs = os.listdir(path)  # get all files in the folder
# Strip the extension from each file name to get the base names
listoffiles = []
for file in dirs:
    basename = os.path.splitext(file)[0]  # file name without its extension
    listoffiles.append(basename)
# matches contains only the rows whose image_id corresponds to a file name
matches = data[data['image_id'].isin(listoffiles)]
print(matches.head())
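The list of base names above drops the extension, so it cannot be used directly to open a matching file later. A minimal sketch of one way around that, assuming the images sit directly in the folder and that each file's base name equals the image_id as text; `index_images` is a hypothetical helper name, not part of the original answer:

```python
from pathlib import Path

def index_images(folder):
    """Map each file's base name (without extension) to its full path.

    Assumption: image files sit directly in `folder`; subdirectories are skipped.
    """
    return {p.stem: p for p in Path(folder).iterdir() if p.is_file()}
```

With this, `index_images(path).get(str(image_id))` returns the path of the matching file, or `None` when no file has that name.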
Hope this helps.
Edit: you can later iterate over matches to actually do the cropping:
for index, row in matches.iterrows():
    print(row['image_id'], row['bounding_box'])
    # do cropping here
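The cropping step itself might look like the sketch below. It assumes each bounding box is a dict with `left`, `top`, `right` and `bottom` keys holding fractions of the image width and height (as in the sample data quoted in this thread), and that the third-party Pillow library is installed. The names `to_pixel_box` and `crop_file` are hypothetical helpers introduced for illustration.

```python
def to_pixel_box(bounds, width, height):
    """Convert fractional bounds to a (left, upper, right, lower) pixel box.

    Assumption: the values in `bounds` are fractions of the image size (0 to 1).
    """
    left = round(bounds["left"] * width)
    upper = round(bounds["top"] * height)
    right = round(bounds["right"] * width)
    lower = round(bounds["bottom"] * height)
    return (left, upper, right, lower)

def crop_file(src, dst, bounds):
    """Crop the image at `src` to `bounds` and save the result to `dst`."""
    # Imported here so to_pixel_box stays usable without Pillow installed.
    from PIL import Image
    with Image.open(src) as img:
        box = to_pixel_box(bounds, img.width, img.height)
        img.crop(box).save(dst)
```

If the CSV stores pixel coordinates rather than fractions, the conversion step can be skipped and the four values passed to `crop` directly.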
In my opinion you could build upon the json.loads method.
In [23]: from json import loads
...:
...: data = '''\
...: 1693884003 {'right': 0.6428571428571429, 'bottom': 0.9761904761904762, 'top': 0.38095238095238093, 'left': 0.22857142857142856}
...: 1693884030 {'right': 0.6571428571428571, 'bottom': 0.9285714285714286, 'top': 0.38095238095238093, 'left': 0.3142857142857143}
...: 1735837028 {'right': 0.68, 'bottom': 0.9, 'top': 0.4, 'left': 0.34}
...: 1740301012 {'right': 0.6142857142857143, 'bottom': 0.9523809523809523, 'top': 0.38095238095238093, 'left': 0.35714285714285715}
...: 1779624112 {'right': 0.7142857142857143, 'bottom': 0.9047619047619048, 'top': 0.5357142857142857, 'left': 0.21428571428571427}\
...: '''
...: images = {}
...: for line in data.splitlines():
...:     image, bounds = line.split(' ', 1)
...:     images[image] = loads(bounds.replace("'", '"'))
...: from pprint import pprint
...: pprint(images)
{'1693884003': {'bottom': 0.9761904761904762,
                'left': 0.22857142857142856,
                'right': 0.6428571428571429,
                'top': 0.38095238095238093},
 '1693884030': {'bottom': 0.9285714285714286,
                'left': 0.3142857142857143,
                'right': 0.6571428571428571,
                'top': 0.38095238095238093},
 '1735837028': {'bottom': 0.9, 'left': 0.34, 'right': 0.68, 'top': 0.4},
 '1740301012': {'bottom': 0.9523809523809523,
                'left': 0.35714285714285715,
                'right': 0.6142857142857143,
                'top': 0.38095238095238093},
 '1779624112': {'bottom': 0.9047619047619048,
                'left': 0.21428571428571427,
                'right': 0.7142857142857143,
                'top': 0.5357142857142857}}
In [24]:
Note that I read from a string, while you will be reading from an open file. Note also that json.loads accepts only double quotes as string delimiters, so we have to replace the single quotes in your data with double quotes before calling json.loads.
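As an alternative to swapping the quotes, the standard library's `ast.literal_eval` parses Python-style dict literals with single quotes as they are. The sketch below swaps in that technique and reads from a file-like object; it assumes each line has the form `<image_id> <dict literal>`, as in the sample above, and `read_bounds` is a hypothetical helper name.

```python
import ast
import io

def read_bounds(lines):
    """Parse lines of the form '<image_id> <dict literal>' into a dict."""
    images = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        image_id, bounds = line.split(" ", 1)
        # literal_eval accepts single-quoted keys, unlike json.loads
        images[image_id] = ast.literal_eval(bounds)
    return images

# io.StringIO stands in for an open file here
sample = io.StringIO(
    "1735837028 {'right': 0.68, 'bottom': 0.9, 'top': 0.4, 'left': 0.34}\n"
)
images = read_bounds(sample)
```

With a real file, `with open(filename) as f: images = read_bounds(f)` works the same way, since an open text file also yields one line at a time.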