How to build a nested list of unknown depth and then get its combinations as separate lists

I want to get all the different combinations of a nested list whose depth is unknown.

I searched for a solution, but none of the similar questions is exactly what I'm looking for.

Note that the nesting may be much deeper than shown here, e.g. 5 or 6 levels.

The underlying problem is implementing a backpointer for the CKY algorithm to get all possible syntactic trees, which gives a nested list whose depth is truly unknown in advance.

I have a backpointer like this:

backpointer = {
    (0, 2, 'NP'): {
        (1, 'AD', 'NP')
    }, 
    (1, 3, 'X1'): {
        (2, 'NP', 'PA')
    }, 
    (1, 3, 'NP'): {
        (2, 'NP', 'NP')
    }, 
    (0, 3, 'X1'): {
        (2, 'NP', 'PA'), 
        (1, 'DT', 'NP')
    }, 
    (2, 4, 'X2'): {
        (3, 'PA', 'VP')
    }, 
    (1, 4, 'S'): {
        (2, 'NP', 'X2'), 
        (3, 'X1', 'VP')
    }, 
    (0, 4, 'S'): {
        (2, 'NP', 'X2'), 
        (3, 'X1', 'VP')
    }
}

which I trace back from (0, 4, 'S'), considering every possible derivation.

My current output looks like this; it is flat and not grouped by tree:

[
    (0, 4, 'S'), (0, 3, 'X1'), (0, 2, 'NP'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'), 
    (0, 1, 'DT'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP'), (3, 4, 'VP'), (0, 2, 'NP'), 
    (0, 1, 'AD'), (1, 2, 'NP'), (2, 4, 'X2'), (2, 3, 'PA'), (3, 4, 'VP')
]

I am trying to get it as a nested list like the one below, so that the alternatives are grouped:

[
    (0, 4, 'S'), 
    [
        (0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
    ], 
    [
        (0, 3, 'X1'), 
        (3, 4, 'VP'), 
        [
            (0, 2, 'NP'), (2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
        ], 
        [
            (0, 1, 'AD'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
        ]
    ]
]

and then show it to the user as one flat list for each possible unique tree:

[
    (0, 4, 'S'), (0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'), 
    (1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')
]

[
    (0, 4, 'S'), (0, 3, 'X1'), (3, 4, 'VP'), (0, 2, 'NP'), 
    (2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')
]

[
    (0, 4, 'S'), (0, 3, 'X1'), (3, 4, 'VP'), (0, 1, 'AD'), 
    (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')
]
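
One way to read the nested list above is that each list holds shared items followed by alternative sub-lists, one per branch. Under that assumption (it is an interpretation, not something the question states), a minimal sketch that expands the nested list into one flat list per tree could look like this; expand is a hypothetical helper name:

```python
def expand(nested):
    """Yield one flat list per path through the alternative sub-lists."""
    # Items that are not lists are shared by every alternative at this level.
    prefix = [x for x in nested if not isinstance(x, list)]
    # Each sub-list is one alternative continuation (assumed reading).
    branches = [x for x in nested if isinstance(x, list)]
    if not branches:
        yield prefix
        return
    for branch in branches:
        for tail in expand(branch):
            yield prefix + tail


nested = [
    (0, 4, 'S'),
    [(0, 2, 'NP'), (2, 4, 'X2'), (0, 1, 'AD'), (1, 2, 'NP'), (2, 3, 'PA'), (3, 4, 'VP')],
    [
        (0, 3, 'X1'),
        (3, 4, 'VP'),
        [(0, 2, 'NP'), (2, 3, 'PA'), (0, 1, 'AD'), (1, 2, 'NP')],
        [(0, 1, 'AD'), (1, 3, 'NP'), (1, 2, 'NP'), (2, 3, 'NP')],
    ],
]

flat_trees = list(expand(nested))
for tree in flat_trees:
    print(tree)
```

Because the function recurses once per nesting level, it should handle 5 or 6 levels the same way as 2, as long as the depth stays below Python's recursion limit.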

IIUC, you can first write a recursive generator to "un-nest" your nested lists. Here's a quick and dirty approach (see note 1):

def unnest(lst, append=False):
    chunk = []
    for x in lst:
        if isinstance(x, list):
            if chunk:
                yield chunk
            yield from unnest(x, True)
            chunk = []
        else:
            if append:
                chunk.append(x)
            else:
                yield [x]
    if chunk:
        yield chunk

lst = [0, 1, [2, 3, 4, [5, 6]], 7, [8, 9]]  # per original question
print(list(unnest(lst)))
#[[0], [1], [2, 3, 4], [5, 6], [7], [8, 9]]

Now use itertools.product to get the desired combinations of elements:

from itertools import product
print(list(product(*unnest(lst))))
#[(0, 1, 2, 5, 7, 8),
# (0, 1, 2, 5, 7, 9),
# (0, 1, 2, 6, 7, 8),
# (0, 1, 2, 6, 7, 9),
# (0, 1, 3, 5, 7, 8),
# (0, 1, 3, 5, 7, 9),
# (0, 1, 3, 6, 7, 8),
# (0, 1, 3, 6, 7, 9),
# (0, 1, 4, 5, 7, 8),
# (0, 1, 4, 5, 7, 9),
# (0, 1, 4, 6, 7, 8),
# (0, 1, 4, 6, 7, 9)]
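
As a quick sanity check, the number of tuples should equal the product of the chunk lengths. A minimal sketch, with the chunks copied from the unnest output above (the names chunks, expected_count and combos are only illustrative):

```python
from itertools import product
from math import prod  # available in Python 3.8+

# Chunks as produced by unnest(lst) for the example list.
chunks = [[0], [1], [2, 3, 4], [5, 6], [7], [8, 9]]

# product() picks one element from each chunk, so the count is
# the product of the chunk lengths: 1 * 1 * 3 * 2 * 1 * 2.
expected_count = prod(len(chunk) for chunk in chunks)
combos = list(product(*chunks))

print(expected_count, len(combos))  # 12 12
```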

Notes:

  1. yield from only works in Python 3 (3.3 and later).
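
On an interpreter without yield from, the delegation can be written as an explicit loop, which behaves the same way for plain iteration like this. A sketch (unnest_compat is a hypothetical name, otherwise the logic follows unnest above):

```python
def unnest_compat(lst, append=False):
    """Same chunking as unnest, without using `yield from`."""
    chunk = []
    for x in lst:
        if isinstance(x, list):
            if chunk:
                yield chunk
            # Explicit loop in place of: yield from unnest(x, True)
            for sub in unnest_compat(x, True):
                yield sub
            chunk = []
        elif append:
            chunk.append(x)
        else:
            yield [x]
    if chunk:
        yield chunk


result = list(unnest_compat([0, 1, [2, 3, 4, [5, 6]], 7, [8, 9]]))
print(result)
```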

This works well without building a nested list: it retrieves all possible trees directly from the root.

backpointer = {}
rules_prob = {}
syn_tags = []
syn_probs = []

def get_syn_tree(bp, ix):

    if bp[1] - bp[0] == 1:  # end - start = 1   (2, 3, N)
        return

    current_ix = ix
    for i in range(len(backpointer[bp])-1):
        syn_tags.append(syn_tags[current_ix].copy())

    counter = 0
    for item in list(backpointer[bp]):
        # (0, 6, S) -> (4, N,VP) =>     (0, 4, N) , (4, 6, VP)
        syn_tags[current_ix + counter].add((bp[0], item[0], item[1]))
        syn_tags[current_ix + counter].add((item[0], bp[1], item[2]))

        get_syn_tree((bp[0], item[0], item[1]), current_ix + counter)
        get_syn_tree((item[0], bp[1], item[2]), current_ix + counter)

        counter += 1

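# start, end and A are the span and label of the root cell,
# e.g. 0, 4, 'S' for the backpointer in the question
syn_tags.append({(start, end, A)})
syn_probs.append(0)  # identity value: 0 if scores are added, 1 if they are multiplied
get_syn_tree((start, end, A), 0)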
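
The index arithmetic above relies on each copy landing at current_ix + counter, which may stop holding once an earlier branch has appended copies of its own. A sketch of an alternative that avoids the shared lists is a recursive generator that pairs every left subtree with every right subtree via itertools.product. The name iter_trees is hypothetical, and the sketch assumes that every backpointer entry has the form (split, left_label, right_label) and that cells without an entry are leaves:

```python
from itertools import product

# Backpointer table copied from the question.
backpointer = {
    (0, 2, 'NP'): {(1, 'AD', 'NP')},
    (1, 3, 'X1'): {(2, 'NP', 'PA')},
    (1, 3, 'NP'): {(2, 'NP', 'NP')},
    (0, 3, 'X1'): {(2, 'NP', 'PA'), (1, 'DT', 'NP')},
    (2, 4, 'X2'): {(3, 'PA', 'VP')},
    (1, 4, 'S'): {(2, 'NP', 'X2'), (3, 'X1', 'VP')},
    (0, 4, 'S'): {(2, 'NP', 'X2'), (3, 'X1', 'VP')},
}


def iter_trees(bp_table, node):
    """Yield each tree rooted at `node` as a flat list of (start, end, label)."""
    start, end, _ = node
    alternatives = bp_table.get(node)
    # Span of length 1, or no recorded split: treat the cell as a leaf.
    if end - start == 1 or not alternatives:
        yield [node]
        return
    # sorted() only makes the output order deterministic.
    for split, left_label, right_label in sorted(alternatives):
        left = (start, split, left_label)
        right = (split, end, right_label)
        # Combine every tree of the left child with every tree of the right.
        for left_tree, right_tree in product(
            iter_trees(bp_table, left), iter_trees(bp_table, right)
        ):
            yield [node] + left_tree + right_tree


trees = list(iter_trees(backpointer, (0, 4, 'S')))
print(len(trees))  # 3
for tree in trees:
    print(tree)
```

Each tree is listed in pre-order (parent, then left subtree, then right subtree), so the cell order differs from the lists in the question even though each list describes a complete tree.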
