How to select folders from a directory based on a Python list of the folder names?
I have a list of folder names, "df_train_pos_list".
I want to iterate through a directory, select the folders with those names, and add them to another list, "train_images".
What I have tried so far doesn't work:
train_images = []
train_labels = []
for i in df_train_pos_list:
    for currentpath, folders, files in os.walk('D:\Arm C Deep Learning\SH_OCTAPUS\Train'):
        for file in files:
            if i in currentpath:
                train_images.append('D:\Arm C Deep Learning\SH_OCTAPUS\Train' + file)
                train_labels.append(1)
            else:
                train_images.append('D:\Arm C Deep Learning\SH_OCTAPUS\Train' + file)
                train_labels.append(0)
train_labels = np.asarray(train_labels, dtype=np.int64)
print(train_labels)
np.unique(train_labels, return_counts='TRUE')
I'm not sure whether you want to add the folder paths or the individual files inside each folder to your list, but the snippet below adds the folder paths, each paired with its label, to train_images. I would need more details on what you want out of the labels to add anything beyond that.
import os

df_train_pos_list = []  # your list of positive folder names
train_images = []
# train_labels = []
# Raw string so the backslashes in the Windows path are taken literally
root = r'D:\Arm C Deep Learning\SH_OCTAPUS\Train'
for f in os.listdir(root):
    if f in df_train_pos_list:
        train_label = 1
    else:
        train_label = 0
    # this adds the folder path, together with its label, to train_images
    train_images.append((os.path.join(root, f), train_label))
for folder, label in train_images:
    if label == 1:
        pass  # do something here
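If it turns out you want the individual files rather than the folders, something along these lines might work. This is only a sketch: the helper name collect_labeled_files is made up for illustration, and it assumes every file should inherit the label of the top-level folder it sits in (1 if that folder's name is in your list, 0 otherwise).

```python
from pathlib import Path


def collect_labeled_files(root, positive_names):
    """Return (file_paths, labels) for every file inside root's subfolders.

    A file gets label 1 if the top-level folder it sits in is named in
    positive_names, otherwise 0. (Hypothetical helper, not from the question.)
    """
    positive = set(positive_names)
    file_paths, labels = [], []
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # skip loose files sitting directly under root
        label = 1 if folder.name in positive else 0
        # rglob("*") also descends into nested subfolders
        for file in sorted(folder.rglob("*")):
            if file.is_file():
                file_paths.append(file)
                labels.append(label)
    return file_paths, labels


# Usage with the path from the question (uncomment on your machine):
# train_images, train_labels = collect_labeled_files(
#     r"D:\Arm C Deep Learning\SH_OCTAPUS\Train", df_train_pos_list
# )
```

Unlike the loop in the question, this walks the directory once, so each file appears exactly once no matter how many names are in the list.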
From what I understand, you are trying to generate two lists: one containing all the paths in "D:\Arm C Deep Learning\SH_OCTAPUS\Train", and one containing 0s and 1s depending on whether each path's name is in df_train_pos_list.
This should do the trick:
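from pathlib import Path

df_train_pos_list = []  # your list of positive folder names
train_images = []
train_labels = []
# A set makes each membership check O(1) on average
df_train_pos_set = set(df_train_pos_list)
# Raw string so the backslashes in the Windows path are taken literally
for path in Path(r"D:\Arm C Deep Learning\SH_OCTAPUS\Train").glob("*"):
    train_images.append(path)
    train_labels.append(1 if path.name in df_train_pos_set else 0)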
A couple of things to note:
pathlib is best practice when dealing with the file system.
Create a set from your df_train_pos_list to reduce the cost of the lookups. Building the set from the list takes O(N) time, but checking whether a name is in the set then takes O(1) on average, whereas the same check against a list takes O(N).
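To make the set-versus-list point concrete, here is a small sketch. The folder names are invented for illustration, and positive_list stands in for df_train_pos_list. Both versions produce the same labels; the only difference is how much work each "in" check does.

```python
# Hypothetical folder names, standing in for the contents of the Train directory
folder_names = ["case_001", "case_002", "case_003", "case_004"]
# Stands in for df_train_pos_list
positive_list = ["case_002", "case_004"]

# Built once, in O(N) time
positive_set = set(positive_list)

# Each "in" check scans the list: O(N) per lookup
labels_from_list = [1 if name in positive_list else 0 for name in folder_names]
# Each "in" check is a hash lookup: O(1) on average
labels_from_set = [1 if name in positive_set else 0 for name in folder_names]

print(labels_from_set)  # [0, 1, 0, 1]
```

With only a handful of names the difference is negligible; it starts to matter when both the directory and the list are large.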