Group the elements in a list of lists into a nested dictionary
I have the following sample list of lists (only a section is shown):
[
["4YBB|1|AA|A|262", "4YBB|1|AA|A|263", "s35"],
["4YBB|1|AA|U|261", "4YBB|1|AA|A|263", "tSH",],
["4YBB|1|AA|U|261", "4YBB|1|AA|C|264", "ntSH", "s55"],
["4YBB|1|AA|G|259", "4YBB|1|AA|C|267", "cWW"],
["4WOI|1|DA|A|262", "4WOI|1|DA|A|263", "s35", "cWW"],
["4WOI|1|DA|C|264", "4WOI|1|DA|G|265", "s35"]
....
]
I would like to group the elements in this list into a nested dictionary based on the following lists of keys:
outer_key = ["4YBB|1|AA", "4WOI|1|DA"]
inner_key = [(259, 267), (259, 260), (260, 261), (260, 265), (260, 267), (261, 263), (261, 264), (262, 263), (264, 265), (265, 267)]
As you can see, the outer key is the common prefix of the elements at index [0] and index [1] of each inner list, whereas the values of each inner key tuple are the last fields of the elements at index [0] and index [1] when they are split on the '|' character. The inner key tuples represent all the possible combinations of positions (x, y) that might have an 'interaction' (index [2] onwards of the inner list). As such, not all keys will have a value associated with them. If a particular inner tuple key is not present, its value should be "-". The desired result looks like this:
pw_info = {
"4YBB|1|AA" : {
(259, 267): "cWW",
(259, 260): "-",
(260, 261): "-",
(260, 265): "-",
(260, 267): "-",
(261, 263): "tSH",
(261, 264): "ntSH;s55",
(262, 263): "s35",
(264, 265): "-",
(265, 267): "s35"
},
"4WOI|1|DA" : {
(259, 267): "-",
(259, 260): "-",
(260, 261): "-",
(260, 265): "-",
(260, 267): "-",
(261, 263): "-",
(261, 264): "-",
(262, 263): "s35;cWW",
(264, 265): "s35",
(265, 267): "-"
}
}
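The key derivation described above can be sketched for a single row as follows. This is only an illustration: the helper name `split_identifier` is hypothetical, and it assumes every identifier has at least three `|`-separated fields with an integer in the last one.

```python
def split_identifier(identifier):
    """Split e.g. '4YBB|1|AA|A|262' into ('4YBB|1|AA', 262)."""
    # Split from the right: the last field is the position,
    # the field before it is not needed for either key.
    prefix, _middle, position = identifier.rsplit("|", 2)
    return prefix, int(position)


row = ["4YBB|1|AA|U|261", "4YBB|1|AA|C|264", "ntSH", "s55"]

outer_a, pos_a = split_identifier(row[0])
outer_b, pos_b = split_identifier(row[1])

inner_key = (pos_a, pos_b)      # tuple of the two trailing positions
value = ";".join(row[2:])       # everything from index 2 onwards

print(outer_a, inner_key, value)  # 4YBB|1|AA (261, 264) ntSH;s55
```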
The keys must be ordered according to the outer and the inner key lists. Also, it is possible for the inner list to have more than 3 elements. If there are more than 3 elements, concatenate the elements at index [2] and higher using ";" as the inner dictionary value (for example: (261, 264): "ntSH;s55"). What is the best way to do this?
As for "The keys must be ordered according to the outer and the inner key lists": keep in mind that plain dictionaries only guarantee insertion order from Python 3.7 onwards; on earlier versions they are unordered data structures. An OrderedDict object is an alternative that keeps insertion order on any version.
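To illustrate the ordering point, here is a small sketch using a shortened, made-up subset of the inner keys. Note that the `OrderedDict` is built from an iterable of pairs rather than from a dict, so the order does not pass through a plain dict first.

```python
from collections import OrderedDict

# Shortened key list, for illustration only
inner_keys = [(259, 267), (259, 260), (260, 261)]

plain = {key: "-" for key in inner_keys}
ordered = OrderedDict((key, "-") for key in inner_keys)

# Insertion order of a plain dict is guaranteed on Python 3.7+
print(list(plain) == inner_keys)
# OrderedDict keeps insertion order on any version
print(list(ordered) == inner_keys)
```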
from collections import OrderedDict
import pprint

input_list = [
    ["4YBB|1|AA|A|262", "4YBB|1|AA|A|263", "s35"],
    ["4YBB|1|AA|U|261", "4YBB|1|AA|A|263", "tSH"],
    ["4YBB|1|AA|U|261", "4YBB|1|AA|C|264", "ntSH", "s55"],
    ["4YBB|1|AA|G|259", "4YBB|1|AA|C|267", "cWW"],
    ["4WOI|1|DA|A|262", "4WOI|1|DA|A|263", "s35", "cWW"],
    ["4WOI|1|DA|C|264", "4WOI|1|DA|G|265", "s35"]
]
outer_keys = ["4YBB|1|AA", "4WOI|1|DA"]
inner_keys = [(259, 267), (259, 260), (260, 261), (260, 265), (260, 267),
              (261, 263), (261, 264), (262, 263), (264, 265), (265, 267)]

# prepopulated dict indexed by `outer_keys` and
# containing OrderedDicts with default values for `inner_keys`
# (built from pairs so the order of `inner_keys` is kept on any Python version)
pw_info = {k: OrderedDict((t, '-') for t in inner_keys) for k in outer_keys}

for sub_lst in input_list:
    v1, v2 = sub_lst[0], sub_lst[1]
    # extract the leading part of the first 2 items (like `4YBB|1|AA`)
    # without relying on a fixed prefix length
    k0, k1 = v1.rsplit('|', 2)[0], v2.rsplit('|', 2)[0]
    # check if the 2 prefixes are equal and contained in `pw_info` dict (i.e. `outer_keys`)
    if k0 == k1 and k0 in pw_info:
        # `sub_key` is the key for the inner dict of the predefined `pw_info` dict,
        # composed as a tuple of the trailing numbers of the first 2 items
        # in sub_lst (ex. `(262, 263)`)
        sub_key = (int(v1[v1.rfind('|')+1:]), int(v2[v2.rfind('|')+1:]))
        # join everything from index 2 onwards with ';' (a single item stays as-is)
        pw_info[k0][sub_key] = ';'.join(sub_lst[2:])

pprint.pprint(pw_info)
The output:
{'4WOI|1|DA': OrderedDict([((259, 267), '-'),
                           ((259, 260), '-'),
                           ((260, 261), '-'),
                           ((260, 265), '-'),
                           ((260, 267), '-'),
                           ((261, 263), '-'),
                           ((261, 264), '-'),
                           ((262, 263), 's35;cWW'),
                           ((264, 265), 's35'),
                           ((265, 267), '-')]),
 '4YBB|1|AA': OrderedDict([((259, 267), 'cWW'),
                           ((259, 260), '-'),
                           ((260, 261), '-'),
                           ((260, 265), '-'),
                           ((260, 267), '-'),
                           ((261, 263), 'tSH'),
                           ((261, 264), 'ntSH;s55'),
                           ((262, 263), 's35'),
                           ((264, 265), '-'),
                           ((265, 267), '-')])}
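As a possible variation, the same grouping can be wrapped in a function. The sketch below is not part of the answer above: the name `build_pw_info` is hypothetical, it relies on plain dicts keeping insertion order (Python 3.7+), and it makes two assumptions the question does not state, namely that pairs missing from the inner key list are skipped and that repeated rows for the same pair are merged with ";".

```python
def build_pw_info(rows, outer_keys, inner_keys):
    # Pre-fill every (outer, inner) slot with "-" in the requested order.
    result = {outer: {pair: "-" for pair in inner_keys} for outer in outer_keys}
    for first, second, *interactions in rows:
        outer_a, _, pos_a = first.rsplit("|", 2)
        outer_b, _, pos_b = second.rsplit("|", 2)
        pair = (int(pos_a), int(pos_b))
        # Assumption: rows without interactions, rows mixing two prefixes,
        # and rows outside the given key lists are ignored.
        if not interactions or outer_a != outer_b:
            continue
        if outer_a not in result or pair not in result[outer_a]:
            continue
        joined = ";".join(interactions)
        current = result[outer_a][pair]
        # Assumption: a repeated pair extends the existing value.
        result[outer_a][pair] = joined if current == "-" else current + ";" + joined
    return result


rows = [
    ["4YBB|1|AA|U|261", "4YBB|1|AA|C|264", "ntSH", "s55"],
    ["4WOI|1|DA|A|262", "4WOI|1|DA|A|263", "s35"],
    ["4WOI|1|DA|A|262", "4WOI|1|DA|A|263", "cWW"],  # repeated pair, merged
    ["4WOI|1|DA|A|300", "4WOI|1|DA|A|301", "s35"],  # pair not in inner keys, skipped
]
info = build_pw_info(rows, ["4YBB|1|AA", "4WOI|1|DA"], [(261, 264), (262, 263)])
print(info)
```

Skipping unknown pairs keeps the inner dictionaries limited to the requested keys; if such rows should be kept instead, the second `continue` check can be dropped.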