Chaining together Python Iterators
Trying to see if I can recursively generate an iterator of file paths. Essentially, for a list of base paths and an ordered list of subdirs, I want to generate all child paths as a combination of the two inputs.
i.e.
base_path = ["/a", "/b"], subdir_lists = [ ["1", "2"], ["c", "d"] ]
then the output should be
[ "/a", "/a/1", "/a/1/c", "/a/1/d", "/a/2", "/a/2/c", "/a/2/d", "/b", "/b/1", ... "/b/2/d" ]
My Python code looks something like this. I'm calling appendpaths() recursively.
import os
from itertools import chain, product, starmap

def appendpaths(subdir_lists, base_path):
    if not subdir_lists or len(subdir_lists) == 0:
        return base_path
    if len(subdir_lists) == 1:
        return starmap(os.path.join, product(base_path, subdir_lists[0]))
    right = subdir_lists[1:]
    iter_list = [base_path, appendpaths(right, starmap(os.path.join, product(base_path, subdir_lists[0])))]
    return chain(*iter_list)

def main():
    subdir_lists = [["1", "2"], ["c", "d"]]
    it = appendpaths(subdir_lists, ["/a", "/b"])
    for x in it:
        print(x)

main()
My output is missing a few permutations:
/a
/b
/a/1/c
/a/1/d
/a/2/c
/a/2/d
/b/1/c
/b/1/d
/b/2/c
/b/2/d
You can see that I'm missing /a/1, /a/2, /b/1 and /b/2. I'm guessing it's because somewhere in my code I've already exhausted the generators that iterate through those permutations?
You're complicating this a bit too much. If you just want a consecutive list product, a simple recursion that merges the previously joined paths (or the base) and moves one level deeper on each call is all you need:
import os

def append_paths(base, children):
    paths = []
    for e in base:
        paths.append(e)
        if children:  # dig deeper
            paths += append_paths([os.path.join(e, c) for c in children[0]], children[1:])
    return paths
And to test it:
base_path = ["/a", "/b"] # you might want to prepend with os.path.sep for cross-platform use
subdir_lists = [["1", "2"], ["c", "d"]]
print(append_paths(base_path, subdir_lists))
# ['/a', '/a/1', '/a/1/c', '/a/1/d', '/a/2', '/a/2/c', '/a/2/d',
# '/b', '/b/1', '/b/1/c', '/b/1/d', '/b/2', '/b/2/c', '/b/2/d']
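Since the question asks for an iterator rather than a list, the same recursion can also be written lazily. The following is only a sketch: iter_paths is a hypothetical name, and posixpath.join is used in place of os.path.join as an assumption so that the separators are always "/" regardless of platform.

```python
import posixpath
from itertools import islice


def iter_paths(base, children):
    """Lazily yield each base path, followed by all of its descendants."""
    for path in base:
        yield path
        if children:  # dig deeper, one level of subdirs at a time
            deeper = (posixpath.join(path, c) for c in children[0])
            yield from iter_paths(deeper, children[1:])


paths = iter_paths(["/a", "/b"], [["1", "2"], ["c", "d"]])
# Only the paths that are actually requested get built.
first_three = list(islice(paths, 3))
```

Each path is yielded before its children, so the depth-first order matches the list version above, and nothing is materialized until the caller iterates.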
Given
>>> import pathlib
>>> import itertools as it
>>> base = ["/a", "/b"]
>>> subdirs = [["1", "2"], ["c", "d"]]
A helper itertools recipe:
>>> def powerset(iterable):
...     "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
...     s = list(iterable)
...     return it.chain.from_iterable(it.combinations(s, r) for r in range(len(s)+1))
Code
>>> def subsequence(iterable, pred=None):
...     """Return a non-contiguous subsequence."""
...     if pred is None: pred = lambda x: x
...     return (x for x in powerset(iterable) if x and pred(x))
>>> prods = list(it.product(base, subdirs[0], subdirs[1]))
>>> pred = lambda x: x[0].startswith("/")
>>> result = sorted(set(it.chain.from_iterable(subsequence(p, pred) for p in prods)))
>>> result
[('/a',),
('/a', '1'),
('/a', '1', 'c'),
('/a', '1', 'd'),
('/a', '2'),
('/a', '2', 'c'),
('/a', '2', 'd'),
('/a', 'c'),
('/a', 'd'),
('/b',),
('/b', '1'),
('/b', '1', 'c'),
('/b', '1', 'd'),
('/b', '2'),
('/b', '2', 'c'),
('/b', '2', 'd'),
('/b', 'c'),
('/b', 'd')]
Applications
Join paths as strings or pathlib objects.
>>> ["/".join(x) for x in result]
['/a', '/a/1', '/a/1/c', ...]
>>> [pathlib.Path(*x) for x in result]
[WindowsPath('/a'), WindowsPath('/a/1'), WindowsPath('/a/1/c'), ...]
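The WindowsPath objects above depend on the platform the code runs on. As a small sketch, assuming POSIX-style paths are wanted everywhere, pathlib.PurePosixPath can be used instead; the sample tuples below are just the first few entries of result.

```python
from pathlib import PurePosixPath

# The first few tuples from result, copied here so the snippet runs on its own.
sample = [("/a",), ("/a", "1"), ("/a", "1", "c")]

as_strings = ["/".join(x) for x in sample]
as_paths = [PurePosixPath(*x) for x in sample]

# PurePosixPath always uses "/" as the separator, on any operating system.
as_path_strings = [str(p) for p in as_paths]
```

Pure paths do no filesystem access, so this works even when the directories do not exist.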
Details
Steps
prods holds the output of itertools.product, which accepts iterables and creates unique combinations (or Cartesian products) in a manner analogous to a date picker dialog application. See examples below.
subsequence is simply a wrapper of the powerset itertools recipe. It accepts a predicate, which is used to keep only results whose first element starts with a slash, like those from base.
result sorts a flattened set of subsequences generated for each product. You can optionally join each element as desired.
Examples
Here are the Cartesian products:
>>> prods
[('/a', '1', 'c'),
('/a', '1', 'd'),
('/a', '2', 'c'),
('/a', '2', 'd'),
('/b', '1', 'c'),
('/b', '1', 'd'),
('/b', '2', 'c'),
('/b', '2', 'd')]
Without a predicate, undesired subsequences are permitted:
>>> list(subsequence(prods[0]))
[('/a',),
('1',), # bad
('c',),
('/a', '1'),
('/a', 'c'),
('1', 'c'),  # bad
('/a', '1', 'c')]
Thus, we filter unwanted elements with the predicate, pred.
>>> list(subsequence(prods[0], pred=pred))
[('/a',), ('/a', '1'), ('/a', 'c'), ('/a', '1', 'c')]
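Note that the filtered subsequences still include ('/a', 'c'), which skips a directory level and does not appear in the output the question asks for. If only contiguous leading parts are wanted, one possible variation is to take prefixes of each product instead of powerset subsequences. This is a sketch; prefixes is a hypothetical helper, not an itertools recipe.

```python
import itertools as it

base = ["/a", "/b"]
subdirs = [["1", "2"], ["c", "d"]]


def prefixes(seq):
    """Yield every non-empty leading slice of seq as a tuple."""
    for n in range(1, len(seq) + 1):
        yield tuple(seq[:n])


prods = it.product(base, *subdirs)
# A set removes the prefixes shared between products; sorting restores order.
result = sorted(set(it.chain.from_iterable(prefixes(p) for p in prods)))
paths = ["/".join(x) for x in result]
```

Because every prefix starts with an element of base, no predicate is needed here.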