Sorting a 10k data set into a nested dict of children
I have a large data collection of 10k objects. I want to sort it into a dict in the following way.
{
    'code': obj.code, 'childs': [{
        'code': obj.code, 'childs': [{
            'code': obj.code, 'code': obj.code}]  # no childs here
    }]
}
obj.code is an 8-character string made up of digits, for example:
'01000000',
'01100000',
'01200000',
'21000000',
'21121200',
Two characters followed by six zeros mark a 'root' parent, so '01000000' and '21000000' are root parents. Then '01100000' and '01400000' are first-level children of the '01' parent. Every parent can have at most 9 children. So the tree looks like this:
01000000
  01100000
    01110000
      01111000
        01111100
          01111110
            01111111
            01111112
            01111113
            01111114
            01111115
          01111120
            01111121
            01111122
            01111123
            01111124
          01111130
            01111131
            01111132
            01111133
            01111134
          01111140
            01111141
            01111142
            01111143
            01111144
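To make the numbering rule concrete, here is a minimal sketch of how a code's parent could be derived from the code itself. It assumes the first two characters identify the root and each of the remaining six digits is one level; the name parent_code is hypothetical and not part of the original data model.

```python
def parent_code(code):
    """Return the parent's code, or None if `code` is a root."""
    tail = code[2:].rstrip('0')   # level digits after the two-digit root prefix
    if not tail:                  # only zeros left: a root such as '01000000'
        return None
    # Drop the deepest level digit and pad back to eight characters.
    return (code[:2] + tail[:-1]).ljust(8, '0')


print(parent_code('01000000'))  # None
print(parent_code('01100000'))  # 01000000
print(parent_code('01111113'))  # 01111110
print(parent_code('21121200'))  # 21121000
```

Slicing off the first two characters before stripping zeros keeps a root such as '10000000' from being mistaken for a one-digit prefix.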
I'm not sure where to start, so any hint is much appreciated. Root parents can be found this way:
def mySort(myQuerySet):
    # Root parents: codes that contain six zeros in a row.
    root_parents = myQuerySet.objects(code__icontains='000000')
Here's one possible solution. It iterates through the codes once to build a simple tree, then turns that tree into the form you requested.
import re
from pprint import pprint
from collections import defaultdict

def build_tree(codes):
    """Build the tree from a list of codes (strings)"""
    # tree is a dictionary that maps each code to a list of codes of children.
    tree = defaultdict(list)
    roots = []
    for code in codes:
        if '000000' in code:
            tree[code] = []
            roots.append(code)
        else:
            nonzero = re.search(r'[1-9]0*$', code).start()
            parent = code[:nonzero] + '0' + code[1 + nonzero:]
            tree[parent].append(code)

    # sort children (optional)
    for v in tree.values():
        v.sort()

    # convert original dictionary to one in the desired form.
    def convert(old_parent):
        result = {}
        result['code'] = old_parent
        if len(tree[old_parent]) > 0:
            result['children'] = [convert(c) for c in tree[old_parent]]
        return result

    return [convert(root) for root in roots]

codes = ["01000000", "01100000", "01110000", "01111000", "01111100", "01111110",
         "01111111", "01111112", "01111113", "01111114", "01111115", "01111120",
         "01111121", "01111122", "01111123", "01111124", "01111130", "01111131",
         "01111132", "01111133", "01111134", "01111140", "01111141", "01111142",
         "01111143", "01111144"]
pprint(build_tree(codes))
Here is the output (excuse the formatting):
[{'children': [{'children': [{'children': [{'children': [{'children': [{'children': [{'code': '01111111'},
{'code': '01111112'},
{'code': '01111113'},
{'code': '01111114'},
{'code': '01111115'}],
'code': '01111110'},
{'children': [{'code': '01111121'},
{'code': '01111122'},
{'code': '01111123'},
{'code': '01111124'}],
'code': '01111120'},
{'children': [{'code': '01111131'},
{'code': '01111132'},
{'code': '01111133'},
{'code': '01111134'}],
'code': '01111130'},
{'children': [{'code': '01111141'},
{'code': '01111142'},
{'code': '01111143'},
{'code': '01111144'}],
'code': '01111140'}],
'code': '01111100'}],
'code': '01111000'}],
'code': '01110000'}],
'code': '01100000'}],
'code': '01000000'}]
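The output above uses a 'children' key and starts from plain strings, while the question asked for 'childs' and has objects with a code attribute. A possible variant is sketched below, as an alternative technique rather than the method used above: it sorts the objects by code and keeps a stack of open ancestors, so each node is attached to the nearest ancestor that is actually present. The names nest_sorted and significant_prefix are hypothetical, and SimpleNamespace only stands in for the real objects.

```python
from types import SimpleNamespace


def significant_prefix(code):
    """Root prefix plus level digits, e.g. '01110000' -> '0111'."""
    return code[:2] + code[2:].rstrip('0')


def nest_sorted(objs):
    """Nest objects that have a .code attribute, using the 'childs' key."""
    roots = []
    stack = []  # (prefix, node) pairs from a root down to the latest node
    # Sorted order visits every parent before its descendants.
    for obj in sorted(objs, key=lambda o: o.code):
        node = {'code': obj.code}
        # Close ancestors that this code does not belong to.
        while stack and not obj.code.startswith(stack[-1][0]):
            stack.pop()
        if stack:
            stack[-1][1].setdefault('childs', []).append(node)
        else:
            roots.append(node)  # no open ancestor: treat as a root
        stack.append((significant_prefix(obj.code), node))
    return roots


# Stand-in objects; the real ones would come from the query set.
sample = [SimpleNamespace(code=c) for c in
          ['01110000', '01000000', '21000000', '01100000', '01200000']]
result = nest_sorted(sample)
print(result)
```

Sorting costs O(n log n), which is negligible for 10k objects, and the stack never grows beyond the depth of the tree. Duplicate codes are assumed not to occur.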