Chaining together Python Iterators

I'm trying to recursively generate an iterator of file paths. Given a list of base paths and an ordered list of subdirectory lists, I want to generate every child path formed by combining the two inputs.

For example, given

base_path = ["/a", "/b"], subdir_lists = [ ["1", "2"], ["c", "d"] ] 

then the output should be

[ "/a", "/a/1", "/a/1/c", "/a/1/d", "/a/2", "/a/2/c", "/a/2/d", "/b", "/b/1", ... "/b/2/d" ]

My Python code looks something like this. I'm calling appendpaths() recursively.

import os
from itertools import chain, product, starmap


def appendpaths(subdir_lists, base_path):
    if not subdir_lists or len(subdir_lists) == 0:
        return base_path
    if len(subdir_lists) == 1:
        return starmap(os.path.join, product(base_path, subdir_lists[0]))
    right = subdir_lists[1:]
    iter_list = [base_path, appendpaths(right, starmap(os.path.join, product(base_path, subdir_lists[0])))]
    return chain(*iter_list)


def main():
    subdir_lists = [["1", "2"], ["c", "d"]]
    it = appendpaths(subdir_lists, ["/a", "/b"])
    for x in it:
        print(x)

main()

My output is missing a few permutations:

/a
/b
/a/1/c
/a/1/d
/a/2/c
/a/2/d
/b/1/c
/b/1/d
/b/2/c
/b/2/d

You can see that I'm missing /a/1, /a/2, /b/1 and /b/2. I'm guessing it's because somewhere in my code I've already exhausted the generators that iterate through those permutations?
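
A quick way to check that guess (an illustrative sketch, not part of the original code; the names level1 and level2 are made up here): itertools.product reads each of its input iterables to the end as soon as it is called, so an iterator handed to product has nothing left for chain to yield afterwards. In the two-level example there is a second factor as well: the len(subdir_lists) == 1 branch returns only the product and never chains in the paths it was given.

```python
import os
from itertools import product, starmap

# An iterator (not a list) of first-level paths, like the one
# passed down in the recursive call.
level1 = starmap(os.path.join, product(["/a", "/b"], ["1", "2"]))

# product() consumes its inputs completely when it is constructed,
# before any result is requested.
level2 = product(level1, ["c", "d"])

leftover = list(level1)  # → [] : level1 is already exhausted
first = next(level2)     # level2 still works; it kept its own copy

print(leftover)
print(first)
```

Converting the intermediate results to a list before reusing them, or yielding each path before descending, avoids the problem.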

You're overcomplicating this a bit. If you just want a consecutive list product, a simple recursion that merges the previously joined paths (or the base) and moves one level deeper on each call is all you need:

import os

def append_paths(base, children):
    paths = []
    for e in base:
        paths.append(e)
        if children:  # dig deeper
            paths += append_paths([os.path.join(e, c) for c in children[0]], children[1:])
    return paths

And to test it:

base_path = ["/a", "/b"]  # you might want to prepend with os.path.sep for cross-platform use
subdir_lists = [["1", "2"], ["c", "d"]]

print(append_paths(base_path, subdir_lists))
# ['/a', '/a/1', '/a/1/c', '/a/1/d', '/a/2', '/a/2/c', '/a/2/d',
#  '/b', '/b/1', '/b/1/c', '/b/1/d', '/b/2', '/b/2/c', '/b/2/d']
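
Since the question asks for an iterator rather than a list, the same recursion can probably be written as a generator. The following is a minimal sketch under that assumption; iter_paths is a hypothetical name, and each entry of children is assumed to be a re-iterable sequence such as a list:

```python
import os


def iter_paths(base, children):
    """Lazily yield each base path followed by all of its descendants."""
    for path in base:
        yield path
        if children:  # dig deeper, one level per recursive call
            deeper = (os.path.join(path, c) for c in children[0])
            yield from iter_paths(deeper, children[1:])


base_path = ["/a", "/b"]
subdir_lists = [["1", "2"], ["c", "d"]]

for p in iter_paths(base_path, subdir_lists):
    print(p)
```

Each path is yielded before its children are built, so nothing is materialised up front and no iterator is read twice.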

Given

>>> import pathlib
>>> import itertools as it

>>> base = ["/a", "/b"]
>>> subdirs = [["1", "2"], ["c", "d"]] 

A helper itertools recipe:

>>> def powerset(iterable):
...     "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
...     s = list(iterable)
...     return it.chain.from_iterable(it.combinations(s, r) for r in range(len(s)+1))

Code

>>> def subsequence(iterable, pred=None):
...     """Return a non-contiguous subsequence."""
...     if pred is None: pred = lambda x: x
...     return (x for x in powerset(iterable) if x and pred(x))


>>> prods = list(it.product(base, subdirs[0], subdirs[1]))
>>> pred = lambda x: x[0].startswith("/")
>>> result = sorted(set(it.chain.from_iterable(subsequence(p, pred) for p in prods)))
>>> result
[('/a',),
 ('/a', '1'),
 ('/a', '1', 'c'),
 ('/a', '1', 'd'),
 ('/a', '2'),
 ('/a', '2', 'c'),
 ('/a', '2', 'd'),
 ('/a', 'c'),
 ('/a', 'd'),
 ('/b',),
 ('/b', '1'),
 ('/b', '1', 'c'),
 ('/b', '1', 'd'),
 ('/b', '2'),
 ('/b', '2', 'c'),
 ('/b', '2', 'd'),
 ('/b', 'c'),
 ('/b', 'd')]

Applications

Join paths as strings or pathlib objects.

>>> ["/".join(x) for x in result]
['/a', '/a/1', '/a/1/c', ...]

>>> [pathlib.Path(*x) for x in result]
[WindowsPath('/a'), WindowsPath('/a/1'), WindowsPath('/a/1/c'), ...]
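
The WindowsPath output above reflects the platform the snippet was run on. If forward-slash paths are wanted regardless of platform, pathlib.PurePosixPath is one option. A small sketch, using only the first few tuples from result as sample data:

```python
import pathlib

# First few tuples from `result` above, repeated here so the
# snippet is self-contained.
sample = [("/a",), ("/a", "1"), ("/a", "1", "c")]

posix_paths = [pathlib.PurePosixPath(*x) for x in sample]
print([str(p) for p in posix_paths])
# → ['/a', '/a/1', '/a/1/c']
```

Pure paths only manipulate the path text and never touch the file system, which is enough for generating names.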

Details

Steps

  1. prods holds the results of itertools.product, which accepts iterables and yields every combination of their elements (the Cartesian product), much like the wheels of a date picker dialog. See examples below.
  2. subsequence is simply a wrapper around the powerset itertools recipe. It accepts a predicate, which is used here to keep only results whose first element starts with a slash, like the entries in base.
  3. result sorts a flattened set of the subsequences generated for each product. You can optionally join each element as desired. See Code - Applications.

Examples

Here are the Cartesian products:

>>> prods
[('/a', '1', 'c'),
 ('/a', '1', 'd'),
 ('/a', '2', 'c'),
 ('/a', '2', 'd'),
 ('/b', '1', 'c'),
 ('/b', '1', 'd'),
 ('/b', '2', 'c'),
 ('/b', '2', 'd')]

Without a predicate, undesired subsequences are permitted:

>>> list(subsequence(prods[0]))
[('/a',),
 ('1',),                                                 # bad
 ('c',),
 ('/a', '1'),
 ('/a', 'c'),
 ('1', 'c'),                                             # bad
 ('/a', '1', 'c')]

Thus, we filter unwanted elements with the predicate, pred.

>>> list(subsequence(prods[0], pred=pred))
[('/a',), ('/a', '1'), ('/a', 'c'), ('/a', '1', 'c')]
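
Note that the filtered powerset still contains tuples such as ('/a', 'c'), which skip a level and do not appear in the output requested in the question. If only contiguous paths are wanted, one option is to take the prefixes of each product instead of its powerset. A minimal sketch, where prefixes is a hypothetical helper and not an itertools recipe:

```python
import itertools as it

base = ["/a", "/b"]
subdirs = [["1", "2"], ["c", "d"]]


def prefixes(seq):
    """Yield every non-empty leading slice of seq as a tuple."""
    return (tuple(seq[:n]) for n in range(1, len(seq) + 1))


prods = it.product(base, *subdirs)
# The set removes prefixes shared by several products; sorting
# places each parent directly before its children.
result = sorted(set(it.chain.from_iterable(prefixes(p) for p in prods)))
paths = ["/".join(x) for x in result]
print(paths)
# → ['/a', '/a/1', '/a/1/c', '/a/1/d', '/a/2', '/a/2/c', '/a/2/d',
#    '/b', '/b/1', '/b/1/c', '/b/1/d', '/b/2', '/b/2/c', '/b/2/d']
```

Because every prefix keeps the base element first, no predicate is needed here.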
