
List of tuples from hierarchical nested lists

I have an outer list of inner elements, each inner element being a flat or nested list. Each inner list has a nesting structure that matches the inner list in the preceding outer cell, meaning that each primitive value in a list corresponds either to a primitive value or to a list in the following cell's list (applied recursively). Thus, each inner list has a depth that equals, or exceeds by 1, the depth of the element in the preceding cell.

(Note that the first cell element can start as a nested list of any depth.)

Example of the above:

[
    [[1, 2,      [3, 4]], 1           ],
    [[3, [4, 5], [6, 7]], [5, 4]      ],   
    [[5, [6, 7], [8, 9]], [7, [8, 6]] ],
]
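The matching rule above can be checked mechanically. The following is a minimal sketch, not part of the original question: `matches` and `is_valid` are hypothetical helper names, and it assumes that corresponding lists must have equal length and that a primitive may only be followed by a primitive or by a flat list of primitives.

```python
def matches(prev, nxt):
    """Return True if `nxt` may sit below `prev` under the assumed rule."""
    if isinstance(prev, list):
        # A list must line up with a list of the same length, element by element.
        return (isinstance(nxt, list)
                and len(prev) == len(nxt)
                and all(matches(p, n) for p, n in zip(prev, nxt)))
    if isinstance(nxt, list):
        # A primitive may expand into a flat list (depth grows by exactly 1).
        return not any(isinstance(n, list) for n in nxt)
    # Primitive below primitive is always fine.
    return True


def is_valid(rows):
    """Check every pair of consecutive outer cells."""
    return all(matches(a, b) for a, b in zip(rows, rows[1:]))


example = [
    [[1, 2, [3, 4]], 1],
    [[3, [4, 5], [6, 7]], [5, 4]],
    [[5, [6, 7], [8, 9]], [7, [8, 6]]],
]
print(is_valid(example))                        # True
print(is_valid([[1, 2], [1, [2, [3]]]]))        # False: depth jumps by 2
```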

The goal is to unfold the nested lists into a list of tuples, where each value is combined either with the parent value or, if the parent is a list, with the corresponding list element (with order maintained). For the example list above, the output should be:

[
(1, 3, 5),
(2, 4, 6),
(2, 5, 7),
(3, 6, 8),
(4, 7, 9),
(1, 5, 7),
(1, 4, 8),
(1, 4, 6),
]

Note: this question expands on a previous question, but unlike the linked question, the desired tuples here are flat.

OK, how about this:

x = [
    [[1, 2,      [3, 4]], 1           ],
    [[3, [4, 5], [6, 7]], [5, 4]      ],   
    [[5, [6, 7], [8, 9]], [7, [8, 6]] ],
]

from collections import defaultdict

def g(x):
    paths = defaultdict(lambda: [])

    def calculate_paths(item, counts):
        if type(item) is list:
            for i, el in enumerate(item):
                calculate_paths(el, counts + (i,))
        else:
            paths[counts].append(item)

    def recreate_values(k, initial_len, desired_len):
        if len(paths[k]) + initial_len == desired_len:
            yield paths[k]
        else:
            for ks in keysets:
                if len(ks) > len(k) and ks[0:len(k)] == k:
                    for ks1 in recreate_values(ks, initial_len + len(paths[k]), desired_len):
                        yield paths[k] + ks1

    for lst in x:
        calculate_paths(lst, (0,))
    keysets = sorted(list(paths.keys()))
    for k in keysets:
        yield from recreate_values(k, 0, len(x))


>>> import pprint
>>> pprint.pprint(list(g(x)))
[[1, 3, 5],
 [2, 4, 6],
 [2, 5, 7],
 [3, 6, 8],
 [4, 7, 9],
 [1, 5, 7],
 [1, 4, 8],
 [1, 4, 6]]

This works by creating a "path" for each number in the structure: a tuple that identifies where the number sits within its particular row.
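To make the "path" idea concrete, here is a small standalone sketch that lists the index path of every number in one row. `index_paths` is a hypothetical name used only for illustration; it mirrors what `calculate_paths` records, including the leading `0` in each path.

```python
def index_paths(item, prefix=(0,)):
    """Yield (path, value) pairs, where path is the tuple of list indices
    followed to reach value."""
    if isinstance(item, list):
        for i, el in enumerate(item):
            yield from index_paths(el, prefix + (i,))
    else:
        yield prefix, item


row = [[3, [4, 5], [6, 7]], [5, 4]]
for path, value in index_paths(row):
    print(path, value)
# (0, 0, 0) 3
# (0, 0, 1, 0) 4
# (0, 0, 1, 1) 5
# (0, 0, 2, 0) 6
# (0, 0, 2, 1) 7
# (0, 1, 0) 5
# (0, 1, 1) 4
```

In the first row the value `2` has the path `(0, 0, 1)`, which is a prefix of `(0, 0, 1, 0)` and `(0, 0, 1, 1)` above. That prefix relationship is what `recreate_values` uses to join a parent value with the values that descend from it in later rows.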


(Original attempt):

If it's always three levels, then something like this?

def listify(lst):
    max_len = max(len(item) if type(item) is list else 1 for item in lst)
    yield from zip(*[item if type(item) is list else [item] * max_len for item in lst])


def f():
    for i in listify(x):
        for j in listify(i):
            for k in listify(j):
                yield k

>>> list(f())
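The same zip-and-repeat idea can be applied recursively so that it no longer depends on there being exactly three levels. This is a sketch of one possible generalization rather than the answerer's code; `zip_nested` is a hypothetical name, and it assumes the input follows the matching rule from the question (lists that correspond to each other have equal length).

```python
def zip_nested(column):
    """column holds one item per outer row. Repeat primitives to the width of
    the sibling lists, zip, and recurse until only primitives remain."""
    if not any(isinstance(item, list) for item in column):
        yield tuple(column)
        return
    width = max(len(item) for item in column if isinstance(item, list))
    expanded = [item if isinstance(item, list) else [item] * width
                for item in column]
    for sub in zip(*expanded):
        yield from zip_nested(sub)


x = [
    [[1, 2, [3, 4]], 1],
    [[3, [4, 5], [6, 7]], [5, 4]],
    [[5, [6, 7], [8, 9]], [7, [8, 6]]],
]
result = list(zip_nested(x))
print(result)
# [(1, 3, 5), (2, 4, 6), (2, 5, 7), (3, 6, 8), (4, 7, 9), (1, 5, 7), (1, 4, 8), (1, 4, 6)]
```

Because the recursion stops as soon as a group contains only primitives, the nesting depth of the input does not need to be known in advance.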

This is one heck of a problem to solve :-)

I did manage to get a solution that handles different levels as expected. However, I made one assumption to do that:

  • The last column of the input is the pointer to the other columns

If that is not an issue, the following solution will work fine :-)

input = [
    [[1, 2,      [3, 4]], 1           ],
    [[3, [4, 5], [6, 7]], [5, 4]      ],
    [[5, [6, 7], [8, 9]], [7, [8, 6]] ],
]

def level_flatten(level):
    """
    This method compares the elements and their types of last column and
    makes changes to other columns accordingly
    """
    for k, l in level.items():
        size = len(l[-1]) if isinstance(l[-1], list) else 1
        # Mostly l[-1] is going to be a list; this is for just in case
        elements = l[-1]
        for i in range(-1, -len(l)-1, -1):
            elem = l[i]
            if isinstance(l[i], int):
                l[i] = [elem] * size
            else:
                for j in range(len(elem)):
                    if not isinstance(elem[j], type(elements[j])):
                        # For a list found in elements[j], there is a int at l[i][j]
                        elem[j] = [elem[j]] * len(elements[j])
    return level

level = {}

for i in range(len(input[0])):
    level[i] = []
    for j in input:
        level[i].append(j[i])

for k, l in level.items():
    for i in range(len(l[-1])):
        level = level_flatten(level)

    total_flat = []
    for item in l:
        row = []
        for x in item:
            if isinstance(x, list):
                row += x
            else:
                row.append(x)
        total_flat.append(row)
    level[k] = total_flat

output_list = []
for i in range(len(level)):# For maintaining the order
    output_list += zip(*level[i])

print(output_list)

I know this is not a pretty solution and it could be optimized further. I am trying to think of a better algorithm than this. I will update if I get to a better solution :-)

I first tried to solve this using a 2D matrix, but it turned out to be simpler to iterate over the last row and split the column segments above it:

def unfold(ldata):
    ''' 
    ldata: list of hierarchical lists.
    technique: repeatedly flatten bottom row one level at a time, unpacking lists or
    adding repeats in the column above at the same time. 
    convention: n=1 are primitives, n>=2 are lists.
    '''

    has_lists = True
    while has_lists:
        has_lists = False 
        for i, elm in enumerate(ldata[-1]):
            if type(elm) is list:
                has_lists = True
                ldata[-1][i:i+1] = ldata[-1][i] # unpack
                for k in range(0, len(ldata)-1): # over corresponding items in above column
                    if type(ldata[k][i]) is list:
                        ldata[k][i:i+1] = ldata[k][i] # unpack
                    else:
                        ldata[k][i:i+1] = [ldata[k][i]]*len(elm) # add repeats
    return list(zip(*ldata))            

x = [
    [[1, 2,      [3, 4]], 1           ],
    [[3, [4, 5], [6, 7]], [5, 4]      ],   
    [[5, [6, 7], [8, 9]], [7, [8, 6]] ],
]

from pprint import pprint
pprint(unfold(x))

>>>
[(1, 3, 5),
 (2, 4, 6),
 (2, 5, 7),
 (3, 6, 8),
 (4, 7, 9),
 (1, 5, 7),
 (1, 4, 8),
 (1, 4, 6)]
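One thing to be aware of: `unfold` rewrites the rows of its argument in place through slice assignment, so `x` itself is flattened after the call. If the original nested data is still needed, a small wrapper can work on a deep copy instead. This is a minimal sketch; `unfold_copy` is a hypothetical name, and the body of `unfold` is repeated (condensed, same logic) only so the block runs on its own.

```python
import copy


def unfold(ldata):
    """Same technique as above: flatten the bottom row one level at a time,
    unpacking or repeating the items in the rows above. Mutates ldata."""
    has_lists = True
    while has_lists:
        has_lists = False
        for i, elm in enumerate(ldata[-1]):
            if type(elm) is list:
                has_lists = True
                ldata[-1][i:i+1] = ldata[-1][i]  # unpack
                for k in range(0, len(ldata) - 1):
                    if type(ldata[k][i]) is list:
                        ldata[k][i:i+1] = ldata[k][i]  # unpack
                    else:
                        ldata[k][i:i+1] = [ldata[k][i]] * len(elm)  # add repeats
    return list(zip(*ldata))


def unfold_copy(ldata):
    """Run unfold on a deep copy so the caller's nested lists stay intact."""
    return unfold(copy.deepcopy(ldata))


data = [
    [[1, 2, [3, 4]], 1],
    [[3, [4, 5], [6, 7]], [5, 4]],
    [[5, [6, 7], [8, 9]], [7, [8, 6]]],
]
before = copy.deepcopy(data)
tuples = unfold_copy(data)
print(tuples)
print(data == before)  # True: the input was not modified
```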
