File generator to get files from leaf folders ignoring hidden folders
I have a folder structure with some epub and json files in the bottom-most folders (not counting the .ts folders). I'm exporting tags from the json files to TagSpaces by creating a .ts folder containing other json files. I've already processed some of the files, and now I want to find the json files in the leaf folders that don't have a .ts folder in their path, so that I don't have to process the same files twice.
I want to process the files in the directories as I find them, instead of building a list of all the files and then looping through it, which is why I want to write a generator.
In this example I should get the file test/t1/t2/test.json as the result, but I'm getting test/t1/test.json instead, which is wrong because t1 is not a leaf folder.
test
├── t1
│   ├── t2
│   │   └── test.json
│   ├── test.json
│   └── .ts
│       └── test.json
└── .ts
    └── t3
        └── test.json
This is what I've tried:
import os
import shutil
from pathlib import Path
from typing import List


def file_generator(path: str) -> List[str]:
    for root, subdirs, filenames in os.walk(path):
        # If only hidden folders left, ignore current folder
        if all([d[0] == '.' for d in subdirs]):
            continue
        # Ignore hidden subfolders
        subdirs[:] = [d for d in subdirs if d[0] != '.']
        # Return files in current folder
        for filename in filenames:
            if filename.endswith('.json'):
                meta_file = os.path.join(root, filename)
                yield meta_file


def test_file_generator():
    try:
        os.makedirs('test/t1/t2', exist_ok=True)
        os.makedirs('test/t1/.ts', exist_ok=True)
        os.makedirs('test/.ts/t3', exist_ok=True)
        Path('test/t1/t2/test.json').touch()
        Path('test/t1/test.json').touch()
        Path('test/t1/.ts/test.json').touch()
        Path('test/.ts/t3/test.json').touch()
        gen = file_generator('test')
        assert tuple(gen) == ('test/t1/t2/test.json',)
    finally:
        shutil.rmtree('test')
You reversed the condition: you skip only the leaf folders, rather than everything else. You also skip at the wrong time, because even when you're not in a leaf folder you still want to remove all the hidden folders, so that os.walk doesn't descend into them.
import os
from typing import Iterator


# You don't actually return a list, so I changed it so it typechecks!
def file_generator(path: str) -> Iterator[str]:
    for root, subdirs, filenames in os.walk(path):
        # Ignore hidden subfolders (in-place, so os.walk won't descend into them)
        subdirs[:] = [d for d in subdirs if d[0] != '.']
        # If any subfolders are left, ignore current folder
        if subdirs:
            continue
        # Yield files in current folder
        for filename in filenames:
            if filename.endswith('.json'):
                meta_file = os.path.join(root, filename)
                yield meta_file
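
To check the corrected generator against the tree from the question, here is a minimal sketch. The helpers build_tree and find_leaf_json are hypothetical names that are not part of the original post, and the tree is built in a temporary directory rather than in ./test so that nothing in the working directory gets touched. Results are compared as paths relative to the tree root.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterator, List


def file_generator(path: str) -> Iterator[str]:
    for root, subdirs, filenames in os.walk(path):
        # Prune hidden subfolders in place so os.walk skips them
        subdirs[:] = [d for d in subdirs if not d.startswith('.')]
        # Visible subfolders remain, so this is not a leaf folder
        if subdirs:
            continue
        for filename in filenames:
            if filename.endswith('.json'):
                yield os.path.join(root, filename)


def build_tree(base: str) -> None:
    # Hypothetical helper: recreates the example layout from the question
    for rel in ('t1/t2/test.json', 't1/test.json',
                't1/.ts/test.json', '.ts/t3/test.json'):
        target = Path(base, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()


def find_leaf_json() -> List[str]:
    # Hypothetical helper: runs the generator on a throwaway copy of the tree
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.join(tmp, 'test')
        build_tree(base)
        return [os.path.relpath(p, base) for p in file_generator(base)]


if __name__ == '__main__':
    for found in find_leaf_json():
        print(found)
```

Note that with this approach a folder whose only subfolders are hidden (for example, t1 if t2 did not exist) counts as a leaf folder, which matches the "not counting the .ts folders" requirement in the question.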