Chaining together Python Iterators
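I'm trying to see whether I can recursively build an iterator of file paths. Essentially, given a list of base paths and an ordered list of subdirectory lists, I want to generate every sub-path as a combination of the two inputs.
That is, given
base_path = ["/a", "/b"], subdir_lists = [ ["1", "2"], ["c", "d"] ]
the output should be
[ "/a", "/a/1", "/a/1/c", "/a/1/d", "/a/2", "/a/2/c", "/a/2/d", "/b", "/b/1", ... "/b/2/d" ]
My Python code looks like this. I call appendpaths() recursively.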
import os
from itertools import chain, product, starmap

def appendpaths(subdir_lists, base_path):
    if not subdir_lists or len(subdir_lists) == 0:
        return base_path
    if len(subdir_lists) == 1:
        return starmap(os.path.join, product(base_path, subdir_lists[0]))
    right = subdir_lists[1:]
    iter_list = [base_path, appendpaths(right, starmap(os.path.join, product(base_path, subdir_lists[0])))]
    return chain(*iter_list)

def main():
    subdir_lists = [["1", "2"], ["c", "d"]]
    it = appendpaths(subdir_lists, ["/a", "/b"])
    for x in it:
        print(x)

main()
My output is missing some permutations:
/a
/b
/a/1/c
/a/1/d
/a/2/c
/a/2/d
/b/1/c
/b/1/d
/b/2/c
/b/2/d
You'll notice that I'm missing /a/1, /a/2, /b/1 and /b/2. My guess is that somewhere in the code I've exhausted the generator that iterates over those permutations?
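The suspicion about an exhausted iterator can be checked in isolation. The sketch below is not part of the original code: the names level1, level2 and leftover are made up for illustration, and posixpath.join stands in for os.path.join so the strings are the same on every platform. It shows that a starmap object can be walked only once, and that product() drains it.

```python
import posixpath
from itertools import product, starmap

# One-shot iterator over the intermediate paths (/a/1, /a/2, /b/1, /b/2).
level1 = starmap(posixpath.join, product(["/a", "/b"], ["1", "2"]))

# product() reads each input in full before it yields anything,
# so level1 is completely consumed while building the next level.
level2 = list(starmap(posixpath.join, product(level1, ["c", "d"])))

# A second pass over the same iterator finds nothing.
leftover = list(level1)

print(level2)
print(leftover)
```

In appendpaths() the single-level branch also returns only the joined products and never the paths it was given, so the intermediate level would be skipped even if the iterator could be replayed.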
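You're overcomplicating this quite a bit. If all you want is a sequential product of the lists, a simple recursion will do: each call emits the paths joined so far (or the base paths) and then goes one level deeper: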
import os

def append_paths(base, children):
    paths = []
    for e in base:
        paths.append(e)
        if children:  # dig deeper
            paths += append_paths([os.path.join(e, c) for c in children[0]], children[1:])
    return paths
And to test it:
base_path = ["/a", "/b"] # you might want to prepend with os.path.sep for cross-platform use
subdir_lists = [["1", "2"], ["c", "d"]]
print(append_paths(base_path, subdir_lists))
# ['/a', '/a/1', '/a/1/c', '/a/1/d', '/a/2', '/a/2/c', '/a/2/d',
# '/b', '/b/1', '/b/1/c', '/b/1/d', '/b/2', '/b/2/c', '/b/2/d']
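Since the question asks for an iterator, the same recursion can also be written lazily. This is only a sketch of one possible variant, not the answer's own code: iter_paths is a hypothetical name, and posixpath.join replaces os.path.join so the output is identical on every platform.

```python
import posixpath
from typing import Iterable, Iterator, Sequence


def iter_paths(base: Iterable[str], children: Sequence[Sequence[str]]) -> Iterator[str]:
    """Yield each base path, followed by everything below it, depth first."""
    for e in base:
        yield e
        if children:  # dig deeper, one subdirectory level per recursion
            deeper = (posixpath.join(e, c) for c in children[0])
            yield from iter_paths(deeper, children[1:])


base_path = ["/a", "/b"]
subdir_lists = [["1", "2"], ["c", "d"]]
for path in iter_paths(base_path, subdir_lists):
    print(path)
```

Each generator is walked exactly once by its own for loop, so nothing is exhausted before it has been emitted.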
Given
>>> import pathlib
>>> import itertools as it
>>> base = ["/a", "/b"]
>>> subdirs = [["1", "2"], ["c", "d"]]
A helper itertools recipe:
>>> def powerset(iterable):
...     "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
...     s = list(iterable)
...     return it.chain.from_iterable(it.combinations(s, r) for r in range(len(s)+1))
Code
>>> def subsequence(iterable, pred=None):
...     """Return a non-contiguous subsequence."""
...     if pred is None: pred = lambda x: x
...     return (x for x in powerset(iterable) if x and pred(x))
>>> prods = list(it.product(base, subdirs[0], subdirs[1]))
>>> pred = lambda x: x[0].startswith("/")
>>> result = sorted(set(it.chain.from_iterable(subsequence(p, pred) for p in prods)))
>>> result
[('/a',),
('/a', '1'),
('/a', '1', 'c'),
('/a', '1', 'd'),
('/a', '2'),
('/a', '2', 'c'),
('/a', '2', 'd'),
('/a', 'c'),
('/a', 'd'),
('/b',),
('/b', '1'),
('/b', '1', 'c'),
('/b', '1', 'd'),
('/b', '2'),
('/b', '2', 'c'),
('/b', '2', 'd'),
('/b', 'c'),
('/b', 'd')]
Application
Join the paths as strings or as pathlib objects.
>>> ["/".join(x) for x in result]
['/a', '/a/1', '/a/1/c', ...]
>>> [pathlib.Path(*x) for x in result]
[WindowsPath('/a'), WindowsPath('/a/1'), WindowsPath('/a/1/c'), ...]
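The WindowsPath objects above reflect the platform the session ran on. If forward-slash paths are wanted everywhere, one option is pathlib.PurePosixPath, which only manipulates path strings and never touches the file system. A minimal sketch, where sample is a hand-written stand-in for the first few tuples of result:

```python
import pathlib

# Hypothetical sample: the first three tuples of `result` from the session above.
sample = [("/a",), ("/a", "1"), ("/a", "1", "c")]

# PurePosixPath joins its arguments with "/" regardless of the host OS.
posix_paths = [pathlib.PurePosixPath(*parts) for parts in sample]
as_strings = [str(p) for p in posix_paths]
print(as_strings)
```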
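Details
Steps
prods holds the output of itertools.product, which accepts iterables and creates unique combinations (the Cartesian product), much like the way a date-picker dialog combines its columns. See the example below.
subsequence is just a wrapper around the powerset recipe. It accepts a predicate (pred), which is used to filter the results, for example keeping only those that start with a slash, like the entries of base.
result sorts the flattened set of subsequences generated for each product. You can then join the elements of each tuple as needed; see Application above.
Examples
This is the Cartesian product: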
>>> prods
[('/a', '1', 'c'),
('/a', '1', 'd'),
('/a', '2', 'c'),
('/a', '2', 'd'),
('/b', '1', 'c'),
('/b', '1', 'd'),
('/b', '2', 'c'),
('/b', '2', 'd')]
Without the predicate, undesired subsequences are allowed:
>>> list(subsequence(prods[0]))
[('/a',),
('1',), # bad
('c',), # bad
('/a', '1'),
('/a', 'c'),
('1', 'c'), # bad
('/a', '1', 'c')]
So we filter out the unwanted elements with the predicate pred.
>>> list(subsequence(prods[0], pred=pred))
[('/a',), ('/a', '1'), ('/a', 'c'), ('/a', '1', 'c')]
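Note that the filtered powerset still keeps tuples that skip a level, such as ('/a', 'c'), which the expected output in the question does not list. If only contiguous paths are wanted, one alternative is to take the leading slices (prefixes) of each product instead of its powerset. This is a sketch of that different technique, not the approach above; prefixes, prefix_result and paths are names made up for illustration.

```python
import itertools as it

base = ["/a", "/b"]
subdirs = [["1", "2"], ["c", "d"]]


def prefixes(seq):
    """Yield every non-empty leading slice of seq, shortest first."""
    for n in range(1, len(seq) + 1):
        yield tuple(seq[:n])


# Unpacking subdirs lets product() handle any number of subdirectory levels.
prods = it.product(base, *subdirs)

# The set removes the prefixes shared by several products; sorting restores
# the depth-first order.
prefix_result = sorted(set(it.chain.from_iterable(prefixes(p) for p in prods)))
paths = ["/".join(x) for x in prefix_result]
print(paths)
```

Because every prefix starts with an entry of base, no predicate is needed here.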