How to extract sub-folder path?
I have about 100 folders, each containing files that need to be read and analyzed.
I can read the files from their subfolders, but I want to start processing at, e.g., the 10th folder and continue until the end. I also need the exact folder path.
How can I do this?
To clarify my question, here is a sample extracted from my code:
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
The output is:
D:/PhD/result/Pyradiomic_input/TCGA-02-0006
D:/PhD/result/Pyradiomic_input/TCGA-02-0009
D:/PhD/result/Pyradiomic_input/TCGA-02-0011
D:/PhD/result/Pyradiomic_input/TCGA-02-0027
D:/PhD/result/Pyradiomic_input/TCGA-02-0046
D:/PhD/result/Pyradiomic_input/TCGA-02-0069
Now my question is: how can I start working from, e.g., D:/PhD/result/Pyradiomic_input/TCGA-02-0046 and continue until the end, instead of starting from the top? I tried some ideas, but they did not work.
You could set a flag that is switched on when you hit a specific directory:
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
first_folder = 'TCGA-02-0046'
process = False  # becomes True once first_folder is reached
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
        if first_folder in path:
            process = True
        if process:
            pass  # process folder
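The flag only works as intended if the folders are visited in a predictable order, and os.walk makes no promise about the order of the names it lists. A minimal sketch of one way to make that explicit, assuming alphabetical order is what you want: sort subdirs in place, which os.walk honors when walking top-down. The helper name walk_from is hypothetical, and it compares the last path component instead of doing a substring test.

```python
import os


def walk_from(root_dir, first_folder):
    """Yield (path, files) for every folder from first_folder onward.

    Hypothetical helper: folders are visited in alphabetical order.
    """
    started = False
    for path, subdirs, files in os.walk(root_dir):
        # Sorting in place makes os.walk descend in alphabetical order.
        subdirs.sort()
        # Compare only the last path component with the wanted name.
        if os.path.basename(path) == first_folder:
            started = True
        if started:
            yield path, files
```

It could be used as: for path, files in walk_from(rootDir, 'TCGA-02-0046'): ... with path being the exact folder path.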
If you want a specific folder to indicate where the script should stop processing:
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
first_folder = 'TCGA-02-0046'
last_folder = 'TCGA-02-0099'
process = False  # becomes True once first_folder is reached
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
        if first_folder in path:
            process = True
        if last_folder in path:
            break  # stop here; last_folder itself is not processed
        if process:
            pass  # process folder
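If the folders all sit directly under the root, as in the question's output, the same start/stop idea can be sketched without a flag by listing the subfolders, sorting them, and slicing. This is only a sketch under that assumption; folders_between is a hypothetical name, and unlike the loop above it includes the last folder.

```python
import os


def folders_between(root_dir, first_folder, last_folder):
    """Return full paths of the immediate subfolders of root_dir from
    first_folder through last_folder (both included), sorted by name."""
    names = sorted(
        entry.name for entry in os.scandir(root_dir) if entry.is_dir()
    )
    start = names.index(first_folder)    # raises ValueError if missing
    stop = names.index(last_folder) + 1  # +1 so the last folder is kept
    return [os.path.join(root_dir, name) for name in names[start:stop]]
```

Slicing by position works the same way: names[9:] would start at the 10th folder and run to the end.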
You can also set a list of directories that you want to process:
import os

rootDir = 'D:/PhD/result/Pyradiomic_input/'
process_dirs = ['TCGA-02-0046', ...]  # replace ... with the other folder names
for (path, subdirs, files) in os.walk(rootDir):
    sizefile = len(path)
    if "TCGA-" in path:
        print(path)
    if any(d in path for d in process_dirs):
        pass  # process folder
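Note that the substring test d in path also matches any folder whose path merely contains the name, such as nested subfolders of a listed folder or a longer name that starts the same way. If that is not wanted, a rough sketch of exact matching on the last path component follows; walk_selected is a hypothetical helper name.

```python
import os


def walk_selected(root_dir, wanted_names):
    """Yield the full path of every folder whose own name is in wanted_names."""
    wanted = set(wanted_names)  # set membership is an exact-name test
    for path, subdirs, files in os.walk(root_dir):
        if os.path.basename(path) in wanted:
            yield path
```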
You can simply skip the values you aren't interested in. Here is a slightly simplified version:
counter = 0
# mocking the file operations
for path in ["/dir-1", "/dir-2", "/dir-3", "/dir-4", "/dir-5"]:
    # skip the first two paths
    if counter < 2:
        counter += 1
        continue
    # do something
    print(path)
Alternatively, you could collect the paths first, like this:
paths = []
# mocking the file operations
for path in ["/dir-1", "/dir-2", "/dir-3", "/dir-4", "/dir-5"]:
    # collect paths in a list
    paths.append(path)
# skip the first two elements
paths = paths[2:]
for path in paths:
    # do something
    print(path)
The second version can become a bit shorter if you use generator expressions, but I favor readability.
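For comparison, one possible shorter form of the second version is sketched below. It uses itertools.islice rather than a generator expression, so it is not necessarily what was meant above; the mock path list is the same one used in the examples.

```python
from itertools import islice

# mocking the file operations
paths = ["/dir-1", "/dir-2", "/dir-3", "/dir-4", "/dir-5"]

# islice(iterable, 2, None) skips the first two items and keeps the rest
remaining = list(islice(paths, 2, None))
for path in remaining:
    # do something
    print(path)
```

islice also accepts a lazy iterable such as the result of os.walk, so the skipped entries never need to be stored in a list.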