
Generate all combinations from a nested Python dictionary and segregate them

My sample dict is:

sample_dict = {
    'company': {
        'employee': {
            'name': [
                {'explore': ["noname"],
                 'valid': ["john","tom"],
                 'boundary': ["aaaaaaaaaa"],
                 'negative': ["$"]}],
            'age': [
                {'explore': [200],
                 'valid': [20,30],
                 'boundary': [1,99],
                 'negative': [-1,100]}],
            'others':{
                'grade':[
                    {'explore': ["star"],
                     'valid': ["A","B"],
                     'boundary': ["C"],
                     'negative': ["AB"]}]}
    }
}}

It's a follow-on question to "Split python dictionary to result in all combinations of values".
I would like to get a segregated list of combinations like below.

Valid combinations: [generated only from the valid lists of data]
COMPLETE OUTPUT for VALID CATEGORY:

{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'A'}}} 
{'company': {'employee': {'age': 20}, 'name': 'tom', 'others': {'grade': 'B'}}} 
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': 'john', 'others': {'grade': 'B'}}}
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'A'}}} 
{'company': {'employee': {'age': 30}, 'name': 'tom', 'others': {'grade': 'B'}}}

Negative combinations: [Here it's a bit tricky, because negative values should be combined with the "valid" pool as well, with at least one value being negative]
Complete output expected for NEGATIVE category:
=> [Basically, exclude combinations where all values are valid, ensuring at least one value in the combination is from the negative group]

{'company': {'employee': {'age': 20}, 'name': 'john', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': 'tom', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 100}, 'name': 'john', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'A'}}}
{'company': {'employee': {'age': 30}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': -1}, 'name': '$', 'others': {'grade': 'AB'}}}
{'company': {'employee': {'age': 100}, 'name': '$', 'others': {'grade': 'AB'}}}

In the above output, in the first line, grade is tested for the negative value AB while all the remaining values are kept valid. So it's not necessary to generate the same combination with age as 30, as the intent is to test only the negative set. We can supply the remaining parameters with any valid data.
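The "one negative value, everything else valid" idea can be sketched on a flattened version of the data. This is only an illustration: the function name `one_negative_at_a_time` and the flat `pools` layout are assumptions made for the example, and it covers only the single-negative rows, not the rows above where several values are negative at once.

```python
def one_negative_at_a_time(pools, bad="negative", good="valid"):
    """Yield one flat combination per negative value; all other fields stay valid."""
    for field, categories in pools.items():
        for bad_value in categories.get(bad, []):
            # Start from the first valid value of every field (assumes "valid" is non-empty)
            combo = {name: cats[good][0] for name, cats in pools.items()}
            combo[field] = bad_value  # then swap in exactly one negative value
            yield combo


# Hypothetical flat layout: field name -> category -> list of values
pools = {
    "name": {"valid": ["john", "tom"], "negative": ["$"]},
    "age": {"valid": [20, 30], "negative": [-1, 100]},
    "grade": {"valid": ["A", "B"], "negative": ["AB"]},
}

combos = list(one_negative_at_a_time(pools))
for combo in combos:
    print(combo)
```

This yields one combination per negative value (four for the pools above), which keeps the negative set small when the intent is only to exercise each negative value once.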


Boundary combinations are similar to valid: combinations of values from the boundary pool only.
Explore: similar to negative. Mix with the valid pool and always have at least one explore value in every combination.

Sample dict - revised version

sample_dict2 = {
    'company': {
        'employee_list': [
            {'employee': {'age': [{'boundary': [1,99],
                                   'explore': [200],
                                   'negative': [-1,100],
                                   'valid': [20, 30]}],
                          'name': [{'boundary': ['aaaaaaaaaa'],
                                    'explore': ['noname'],
                                    'negative': ['$'],
                                    'valid': ['john','tom']}],
                          'others': {
                              'grade': [
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A','B']},
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A','B']}]}}},
            {'employee': {'age': [{'boundary': [1, 99],
                                   'explore': [200],
                                   'negative': [],
                                   'valid': [20, 30]}],
                          'name': [{'boundary': [],
                                    'explore': [],
                                    'negative': ['$'],
                                    'valid': ['john', 'tom']}],
                          'others': {
                              'grade': [
                                  {'boundary': ['C'],
                                   'explore': ['star'],
                                   'negative': [],
                                   'valid': ['A', 'B']},
                                  {'boundary': [],
                                   'explore': ['star'],
                                   'negative': ['AB'],
                                   'valid': ['A', 'B']}]}}}
        ]
    }
}

sample_dict2 contains a list of dicts. Here the whole "employee" hierarchy is a list element, and the leaf node "grade" is also a list.
Also, except for "valid" and "boundary", the other data sets can be empty ([]), and we need to handle them as well.
VALID COMBINATIONS will look like:

{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','A']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'john', 'others': {'grade': ['A','B']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','A']}}]}}
{'company': {'employee_list':[{'employee': {'age': 20}, 'name': 'john', 'others': {'grade': ['A','B']}},{'employee': {'age': 1}, 'name': 'tom', 'others': {'grade': ['A','B']}}]}}

plus the combinations with age=30 and name=tom in employee index 0

import itertools

def generate_combinations(thing, positive="valid", negative=None):

    """ Generate all possible combinations, walking and mimicking structure of "thing" """

    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            return thing[positive] if negative is None else [thing[positive][0]] + thing[negative]
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                subresults = []
                for result in generate_combinations(value, positive, negative):
                    subresults.append((key, result))
                results.append(subresults)
            return [dict(result) for result in itertools.product(*results)]

    elif isinstance(thing, list) or isinstance(thing, tuple):  # added tuple just to be safe
        results = []
        for element in thing:  # generate recursive result sets for each element of list
            for result in generate_combinations(element, positive, negative):
                results.append(result)
        return results

    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")


def generate_invalid_combinations(thing):

    """ Generate all possible combinations and weed out the valid ones """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='negative') if result not in valid]


def generate_boundary_combinations(thing):

    """ Generate all possible boundary combinations """

    return generate_combinations(thing, positive="boundary")


def generate_explore_combinations(thing):

    """ Generate all possible explore combinations and weed out the valid ones """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='explore') if result not in valid]

Calling generate_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'tom', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'john', 'others': {'grade': 'B'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 30, 'name': 'tom', 'others': {'grade': 'B'}}}}
]

Calling generate_invalid_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': -1, 'name': '$', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': 'john', 'others': {'grade': 'AB'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 100, 'name': '$', 'others': {'grade': 'AB'}}}}
]
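As a sanity check on the count above: each leaf contributes its first valid value plus its negative values, so the pools are 3 ages, 2 names and 2 grades. That gives 12 mixed combinations, and filtering out the single all-valid one leaves 11. A minimal flat sketch of that arithmetic (the flat `pools` dict is an assumption for illustration, not the nested structure the functions actually walk):

```python
import itertools

# First valid value followed by the negative values, per leaf of sample_dict
pools = {
    "age": [20, -1, 100],
    "name": ["john", "$"],
    "grade": ["A", "AB"],
}
all_valid = {"age": 20, "name": "john", "grade": "A"}

# Cartesian product of the pools, rebuilt into dicts keyed by field name
mixed = [dict(zip(pools, values)) for values in itertools.product(*pools.values())]

# Weed out the one combination that contains no negative value
invalid = [combo for combo in mixed if combo != all_valid]

print(len(mixed), len(invalid))  # 12 11
```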

Calling generate_boundary_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 1, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}},
{'company': {'employee': {'age': 99, 'name': 'aaaaaaaaaa', 'others': {'grade': 'C'}}}}
]

Calling generate_explore_combinations(sample_dict) returns:

[
{'company': {'employee': {'age': 20, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 20, 'name': 'noname', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'john', 'others': {'grade': 'star'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'A'}}}},
{'company': {'employee': {'age': 200, 'name': 'noname', 'others': {'grade': 'star'}}}}
]

REVISED SOLUTION (To match revised problem)

import itertools
import random

def generate_combinations(thing, positive="valid", negative=None):

    """ Generate all possible combinations, walking and mimicking structure of "thing" """

    if isinstance(thing, dict):  # if dictionary, distinguish between two types of dictionary
        if positive in thing:
            if negative is None:
                return thing[positive]  # here it's OK if it's empty
            elif thing[positive]:  # here it's not OK if it's empty
                return [random.choice(thing[positive])] + thing[negative]
            else:
                return []
        else:
            results = []
            for key, value in thing.items():  # generate all possible key: value combinations
                results.append([(key, result) for result in generate_combinations(value, positive, negative)])
            return [dict(result) for result in itertools.product(*results)]

    elif isinstance(thing, (list, tuple)):  # added tuple just to be safe (thanks Padraic!)
        # generate recursive result sets for each element of list
        results = [generate_combinations(element, positive, negative) for element in thing]
        return [list(result) for result in itertools.product(*results)]

    else:  # not a type we know how to handle
        raise TypeError("Unexpected type")


def generate_boundary_combinations(thing):

    """ Generate all possible boundary combinations """

    valid = generate_combinations(thing)

    return [result for result in generate_combinations(thing, negative='boundary') if result not in valid]

generate_invalid_combinations() and generate_explore_combinations() are the same as before. Subtle differences:

Instead of grabbing the first item out of the valid array in a negative evaluation, it now grabs a random item from the valid array.
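One consequence of the random pick is that two runs can produce different negative sets. If reproducible test data matters, one option (an assumption here, not part of the solution above) is to draw from a seeded random.Random instance instead of the module-level random.choice. The helper name `mixed_pool` is hypothetical:

```python
import random

valid_names = ["john", "tom"]
negative_names = ["$"]


def mixed_pool(rng):
    """Return one randomly chosen valid value followed by all negative values."""
    return [rng.choice(valid_names)] + negative_names


# Two generators created with the same seed make the same choice
first = mixed_pool(random.Random(42))
second = mixed_pool(random.Random(42))
print(first == second)  # True
```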

Values for items like 'age': [30] come back as lists as that's how they were specified:

'age': [{'boundary': [1, 99],
    'explore': [200],
    'negative': [-1, 100],
    'valid': [20, 30]}],

If you instead want 'age': 30 like the earlier output examples, then modify the definition accordingly:

'age': {'boundary': [1, 99],
    'explore': [200],
    'negative': [-1, 100],
    'valid': [20, 30]},

The boundary property is now treated like one of the 'negative' values.
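A likely reason this matters for sample_dict2: some boundary lists there are empty, and itertools.product over any empty pool yields nothing, so a boundary-only product would return no combinations at all. Mixing in one valid value keeps the pool non-empty. A small sketch with flat lists (the variable names are illustrative only):

```python
import itertools

age_boundary = [1, 99]
name_boundary = []  # empty, as for 'name' of the second employee in sample_dict2

# Boundary-only product: a single empty pool wipes out every combination
pure = list(itertools.product(age_boundary, name_boundary))

# Treating boundary like 'negative': one valid value is prepended to the pool
mixed = list(itertools.product(age_boundary, ["john"] + name_boundary))

print(pure)   # []
print(mixed)  # [(1, 'john'), (99, 'john')]
```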

Just for reference, I don't plan to generate all the outputs this time: calling generate_combinations(sample_dict2) returns results like:

[
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['john'], 'others': {'grade': ['A', 'A']}, 'age': [20]}}, {'employee': {'name': ['john'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
...
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['A', 'B']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'A']}, 'age': [30]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [20]}}]}},
{'company': {'employee_list': [{'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}, {'employee': {'name': ['tom'], 'others': {'grade': ['B', 'B']}, 'age': [30]}}]}}
]

This is an open-ended hornet's nest of a question.

  1. Look at the whitepapers for the tools by Agitar to see if this is what you are thinking about.

  2. Look at Knuth's work on combinatorics. It's a tough read.

  3. Consider just writing a recursive descent generator that uses 'yield'.
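A minimal sketch of what the third suggestion could look like, assuming the same data layout as in the question (leaf dicts keyed by category, nested dicts and lists above them). The name `iter_combinations` is made up for this example. Sub-results are still collected into lists so that itertools.product can combine them, but the top-level combinations are produced lazily, one at a time:

```python
import itertools


def iter_combinations(thing, category="valid"):
    """Lazily yield every combination for one category, mirroring the structure of `thing`."""
    if isinstance(thing, dict):
        if category in thing:
            # Leaf: a dict mapping category names to lists of values
            yield from thing[category]
        else:
            # Branch: pick one sub-result per key and combine them
            keys = list(thing)
            pools = [list(iter_combinations(thing[key], category)) for key in keys]
            for values in itertools.product(*pools):
                yield dict(zip(keys, values))
    elif isinstance(thing, (list, tuple)):
        # List: one sub-result per element, kept as a list in the output
        pools = [list(iter_combinations(element, category)) for element in thing]
        for values in itertools.product(*pools):
            yield list(values)
    else:
        raise TypeError("Unexpected type")


# Small hypothetical tree in the same shape as the question's data
tree = {"employee": {"name": {"valid": ["john", "tom"]},
                     "age": [{"valid": [20, 30]}]}}

results = list(iter_combinations(tree))
for combo in results:
    print(combo)
```

Because the function is a generator, a caller can stop early (for example with itertools.islice) instead of building the full result list up front.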
