Sorting a 10k-object data set into a dict by children

Question

I have a large data collection of 10k objects. I want to arrange it into a dict in the following way.

{
'code': obj.code, 'childs':[{
    'code': obj.code, 'childs':[{
        'code': obj.code}, {'code': obj.code}]  # no childs here
    }]
}

obj.code is an 8-character string of digits:

'01000000',
'01100000',
'01200000',
'21000000',
'21121200',

Two leading characters followed by six zeros mark a 'root' parent, so '01000000' and '21000000' are root parents. '01100000' and '01400000' are then first-level children of the '01' parent. Every parent can have at most 9 children, so the codes form a tree.

I'm not sure where to start, so any hint is much appreciated. Root parents can be found this way:

def mySort(myQuerySet):
    # Root parents end in six zeros, e.g. '01000000'.
    root_parents = myQuerySet.objects(code__endswith='000000')
    return root_parents

Answer 1

Here's one possible solution. It iterates through the codes once to build a simple tree, then converts that tree into the form you requested.

import re
from pprint import pprint
from collections import defaultdict

def build_tree(codes):
    """Build the tree from a list of codes (strings)"""

    # tree is a dictionary that maps each code to a list of codes of children.
    tree = defaultdict(list)
    roots = []
    for code in codes:
        if code.endswith('000000'):
            tree[code] = []
            roots.append(code)
        else:
            nonzero = re.search(r'[1-9]0*$', code).start()
            parent = code[:nonzero] + '0' + code[1 + nonzero:]
            tree[parent].append(code)

    # sort children (optional)
    for v in tree.values():
        v.sort()

    # convert original dictionary to one in the desired form.
    def convert(old_parent):
        result = {}
        result['code'] = old_parent
        if len(tree[old_parent]) > 0:
            result['children'] = [convert(c) for c in tree[old_parent]]
        return result

    return [convert(root) for root in roots]

codes = ["01000000", "01100000", "01110000", "01111000", "01111100", "01111110",
         "01111111", "01111112", "01111113", "01111114", "01111115", "01111120",
         "01111121", "01111122", "01111123", "01111124", "01111130", "01111131",
         "01111132", "01111133", "01111134", "01111140", "01111141", "01111142",
         "01111143", "01111144"]

pprint(build_tree(codes))

Here is the output (excuse the formatting):

[{'children': [{'children': [{'children': [{'children': [{'children': [{'children': [{'code': '01111111'},
                                                                                     {'code': '01111112'},
                                                                                     {'code': '01111113'},
                                                                                     {'code': '01111114'},
                                                                                     {'code': '01111115'}],
                                                                        'code': '01111110'},
                                                                       {'children': [{'code': '01111121'},
                                                                                     {'code': '01111122'},
                                                                                     {'code': '01111123'},
                                                                                     {'code': '01111124'}],
                                                                        'code': '01111120'},
                                                                       {'children': [{'code': '01111131'},
                                                                                     {'code': '01111132'},
                                                                                     {'code': '01111133'},
                                                                                     {'code': '01111134'}],
                                                                        'code': '01111130'},
                                                                       {'children': [{'code': '01111141'},
                                                                                     {'code': '01111142'},
                                                                                     {'code': '01111143'},
                                                                                     {'code': '01111144'}],
                                                                        'code': '01111140'}],
                                                          'code': '01111100'}],
                                            'code': '01111000'}],
                              'code': '01110000'}],
                'code': '01100000'}],
  'code': '01000000'}]
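
The same idea can be applied directly to objects rather than bare strings, and without a regular expression. The following is only a sketch under stated assumptions: each object exposes a `code` attribute as in the question, and the names `parent_code`, `nest` and `child_key` are hypothetical helpers introduced here, not part of the answer above. The parent is derived by stripping trailing zeros, dropping the last significant digit, and padding back to the original width.

```python
from collections import defaultdict
from types import SimpleNamespace


def parent_code(code):
    """Return the parent's code, or None for a root such as '01000000'."""
    significant = code.rstrip('0')
    if len(significant) <= 2:
        # Only the two root characters (or fewer) remain: this is a root.
        return None
    # Drop the last significant digit and pad back to the full width.
    return significant[:-1].ljust(len(code), '0')


def nest(objects, child_key='childs'):
    """Group objects that carry a .code attribute into nested dicts."""
    children = defaultdict(list)
    roots = []
    # Sorting first keeps roots and siblings in ascending code order.
    for code in sorted(obj.code for obj in objects):
        parent = parent_code(code)
        if parent is None:
            roots.append(code)
        else:
            children[parent].append(code)

    def convert(code):
        node = {'code': code}
        kids = children.get(code)
        if kids:
            node[child_key] = [convert(c) for c in kids]
        return node

    return [convert(root) for root in roots]


# Stand-ins for the real model instances; only .code is assumed.
objs = [SimpleNamespace(code=c)
        for c in ('01100000', '21000000', '01000000', '01110000')]
result = nest(objs)
```

As in the answer's version, a code whose parent is missing from the input is never reached from a root and so does not appear in the result; whether that should raise an error instead depends on the data.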
