
How do I load a dataset file that has a folder name and image name but does not contain an id, in Python using pandas?

The file I am using is a text file in the format shown below. The first column represents the folder name. Here is a sample:

0010\\0010_01_05_03_115.jpg
0010\\0010_01_05_03_121.jpg
0010\\0010_01_05_03_125.jpg

How can I load it into my program? I currently get this error:

    img=image.load_img('TrainImages/' +TrainImages['id'][i].astype('str')+'.png', target_size=(2, 8, 28, 1),grayscale=False)
  File "C:\\Anaconda\\lib\\site-packages\\pandas\\core\\frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\\Anaconda\\lib\\site-packages\\pandas\\core\\indexes\\base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'id'

I am trying to create a training data set by reading in a file and applying some preprocessing to it before doing the rest.

This is the code I tried, and I am not sure if it is correct:

TrainImages=pd.read_csv('client_train_raw.txt')
train_image =[]
for i in tqdm(range(TrainImages.shape[0])):
    img=image.load_img('TrainImages/' +TrainImages['id'][i].astype('str')+'.png', target_size=(2, 8, 28, 1),grayscale=False)
    img = image.img_to_array(img)

You haven't told your dataframe what 'id' means. By default, pd.read_csv treats the first line of the file as the header, so no column is ever named 'id'. It looks like your data file has only one column, the file path, with its components separated by '\\'. You should be able to fix this with:

train_images = pd.read_csv('client_train_raw.txt', header=None, names=['id'])

This will label the single column in your dataframe as 'id', and you'll stop getting that error. I think there will still be some issues with how you are handling file paths, and I'm not sure that the [i] in TrainImages['id'][i].astype('str') is doing what you think it is.
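As a minimal sketch of the difference, the snippet below reads a three-line sample from an in-memory buffer with and without names=['id']. The SAMPLE text is a hypothetical stand-in for client_train_raw.txt, and the variable names are only for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for client_train_raw.txt: one relative path per line.
SAMPLE = (
    "0010\\0010_01_05_03_115.jpg\n"
    "0010\\0010_01_05_03_121.jpg\n"
    "0010\\0010_01_05_03_125.jpg\n"
)

# Default read: the first line is consumed as the header, so there is no 'id' column.
default_df = pd.read_csv(io.StringIO(SAMPLE))

# header=None keeps every line as data; names=['id'] labels the single column.
train_images = pd.read_csv(io.StringIO(SAMPLE), header=None, names=["id"])

# Each cell is already a plain Python str, so no .astype('str') call is needed.
first_path = train_images["id"][0]
```

Note that a Python str has no .astype method, so the expression TrainImages['id'][i].astype('str') from the question would most likely raise an AttributeError once the KeyError is gone; using train_images['id'][i] directly should be enough.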

Also, you probably don't need pandas for this read. Since each line in your file is a path to an image, you could just use:

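with open('client_train_raw.txt', 'r') as a_file:
    for idx, line in enumerate(a_file):
        # Each line will be a path to a data file; strip the trailing newline.
        # idx is an int, so it must be converted before string concatenation.
        img = image.load_img('TrainImages/' + line.strip() + str(idx) + '.png', ...)
        img = image.img_to_array(img)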

or something like that, but I'm not sure what the idx here should be doing.
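A rough sketch of the plain-file approach, under two assumptions not confirmed by the question: that the images live under TrainImages/ in the folder named by each line, and that each line already carries the correct file extension. The helper build_image_paths is hypothetical; it only builds path strings, so it runs without Keras, and it leaves out the index on the assumption that each line already identifies one image.

```python
from pathlib import PurePosixPath


def build_image_paths(lines, root="TrainImages"):
    """Turn listing lines such as '0010\\0010_01_05_03_115.jpg' into paths under root."""
    paths = []
    for line in lines:
        entry = line.strip()  # drop the trailing newline
        if not entry:
            continue  # skip blank lines
        # Split on either separator; empty pieces from doubled separators are dropped.
        parts = [p for p in entry.replace("\\", "/").split("/") if p]
        paths.append(PurePosixPath(root, *parts).as_posix())
    return paths


# Sample lines standing in for the contents of client_train_raw.txt.
sample_lines = ["0010\\0010_01_05_03_115.jpg\n", "\n", "0010\\0010_01_05_03_121.jpg\n"]
image_paths = build_image_paths(sample_lines)
```

With a real file, the open file object can be passed in place of sample_lines, and each resulting string can be handed to image.load_img. Keras documents target_size as an (img_height, img_width) pair, so the four-element tuple in the question would probably need revisiting as well.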
