Merging two dictionaries of lists with the same keys in Python
My problem:
I'm trying to merge two dictionaries of lists into a new dictionary, alternating the elements of the two original lists for each key to create the new list for that key.
So for example, if I have two dictionaries:
strings = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}
Ns = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}
I want to merge these two dictionaries so that the final dictionary will look like:
strings_and_Ns = {'S1': ["string0", "N0", "string1", "N1", "string2"], 'S2': ["string0", "N0", "string1"]}
Or better yet, have the strings from the list joined together for every key, like:
strings_and_Ns = {'S1': ["string0N0string1N1string2"], 'S2': ["string0N0string1"]}
(I'm trying to connect together DNA sequence fragments.)
What I've tried so far:
zip
for S in Ns:
    newsequence = [zip(strings[S], Ns[S])]
    newsequence_joined = ''.join(str(newsequence))
    strings_and_Ns[species] = newsequence_joined
This does not join the sequences together into a single string, and the order of the strings is still incorrect.
Using a defaultdict
from collections import defaultdict
strings_and_Ns = defaultdict(list)
for S in (strings, Ns):
    for key, value in S.iteritems():
        strings_and_Ns[key].append(value)
The order of the strings for this is also incorrect...
Somehow moving along the lists for each key...
for S in strings:
    list = strings[S]
    L = len(list)
    for i in range(L):
        strings_and_Ns[S] = strings_and_Ns[S] + strings[S][i] + strings[S][i]
strings_and_Ns = {}
for k, v in strings.items():
    pairs = zip(v, Ns[k] + [''])  # add empty to avoid need for zip_longest()
    flat = (item for sub in pairs for item in sub)
    strings_and_Ns[k] = ''.join(flat)
flat is built according to the accepted answer here: Making a flat list out of list of lists in Python
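For readers on Python 3, the same idea can be sketched with itertools.chain.from_iterable doing the flattening in place of the nested generator expression. This is only a minimal sketch, not the code from the answer: the helper name merge_alternating is made up for illustration, and it keeps the assumption that each list in Ns is exactly one element shorter than the matching list in strings.

```python
from itertools import chain


def merge_alternating(strings, Ns):
    """Interleave strings[k] and Ns[k] for every key and join the result.

    Hypothetical helper; assumes len(Ns[k]) == len(strings[k]) - 1.
    """
    merged = {}
    for key, fragments in strings.items():
        # Pad the shorter list with '' so zip() does not drop the last fragment.
        pairs = zip(fragments, Ns[key] + [''])
        # Flatten the (fragment, N) pairs into one stream of strings.
        merged[key] = ''.join(chain.from_iterable(pairs))
    return merged


strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}
print(merge_alternating(strings, Ns))
# {'S1': 'string0N0string1N1string2', 'S2': 'string0N0string1'}
```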
You could do it with itertools or with the list slicing described here. The result looks pretty neat with itertools.
import itertools
strings_and_Ns = {}
for skey, sval in strings.iteritems():
    iters = [iter(sval), iter(Ns[skey])]
    strings_and_Ns[skey] = ["".join(it.next() for it in itertools.cycle(iters))]
You have to take care with the relative lengths of your lists. If one iterator raises StopIteration, the merging ends for that key.
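One caveat if you port this to Python 3: since Python 3.7 (PEP 479), a StopIteration that escapes from inside a generator expression is turned into a RuntimeError, so the one-liner with cycle no longer stops quietly. A rough sketch of the same "stop at the first exhausted iterator" behaviour with an explicit loop might look like this (the function name interleave_until_exhausted is hypothetical, chosen only for this example):

```python
import itertools


def interleave_until_exhausted(first, second):
    """Take items alternately from first and second.

    Hypothetical helper; stops as soon as either input runs out.
    """
    iters = [iter(first), iter(second)]
    merged = []
    for it in itertools.cycle(iters):
        try:
            merged.append(next(it))
        except StopIteration:
            # The first exhausted iterator ends the merge for this key.
            break
    return merged


print("".join(interleave_until_exhausted(["string0", "string1", "string2"], ["N0", "N1"])))
# string0N0string1N1string2

# If the first list is the shorter one, trailing items of the second are dropped.
print(interleave_until_exhausted(["a"], ["x", "y"]))
# ['a', 'x']
```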
To alternate x, y iterables inserting default for missing values:
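from itertools import izip_longest
def alternate(x, y, default):
    # fillvalue must be passed by keyword; a third positional argument would be treated as another iterable
    return (item for pair in izip_longest(x, y, fillvalue=default) for item in pair)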
a = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}
b = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}
assert a.keys() == b.keys()
merged = {k: ''.join(alternate(a[k], b[k], '')) for k in a}
print(merged)
{'S2': 'string0N0string1', 'S1': 'string0N0string1N1string2'}
itertools.izip_longest will take care of the uneven length lists, then just use str.join to join into one single string.
strings = {'S1' : ["string0", "string1", "string2"], 'S2' : ["string0", "string1"]}
Ns = {'S1' : ["N0", "N1"], 'S2' : ["N0"]}
from itertools import izip_longest as iz
strings_and_Ns = {k:["".join([a+b for a, b in iz(strings[k],v,fillvalue="")])] for k,v in Ns.items()}
print(strings_and_Ns)
{'S2': ['string0N0string1'], 'S1': ['string0N0string1N1string2']}
Which is the same as:
strings_and_Ns = {}
for k, v in Ns.items():
    strings_and_Ns[k] = ["".join([a + b for a, b in iz(strings[k], v, fillvalue="")])]
Using izip_longest means the code will work no matter which dict's values contain more elements.
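To illustrate that claim, here is a small sketch for Python 3, where izip_longest is named zip_longest. The helper name join_pairs is an assumption made for this example and does not appear in the answer.

```python
from itertools import zip_longest  # Python 3 name for izip_longest


def join_pairs(left, right):
    """Hypothetical helper: interleave two lists of strings and join them."""
    # Missing items on either side are filled with '' so a + b always works.
    return "".join(a + b for a, b in zip_longest(left, right, fillvalue=""))


# strings side longer than the Ns side
print(join_pairs(["string0", "string1", "string2"], ["N0", "N1"]))
# string0N0string1N1string2

# Ns side longer than the strings side
print(join_pairs(["string0"], ["N0", "N1", "N2"]))
# string0N0N1N2
```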
Similar to the other solutions posted, but I would move some of it off into a function:
import itertools
def alternate(*iters, **kwargs):
    return itertools.chain(*itertools.izip_longest(*iters, **kwargs))
result = {k: ''.join(alternate(strings[k], Ns[k] + [''])) for k in Ns}
print result
Gives:
{'S2': 'string0N0string1', 'S1': 'string0N0string1N1string2'}
The alternate function is from https://stackoverflow.com/a/2017923/66349 . It takes iterables as arguments and chains together items from each one successively (using izip_longest as Padraic Cunningham did).
You can either specify fillvalue='' to handle the different length lists, or just manually pad out the shorter list as I have done above (which assumes Ns will always be one shorter than strings).
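As a sketch of the fillvalue='' option on Python 3 (assuming zip_longest as the replacement for izip_longest, and chain.from_iterable in place of unpacking into chain), the keyword is simply forwarded through **kwargs. Without it, the missing items would be padded with None and ''.join would raise a TypeError.

```python
from itertools import chain, zip_longest


def alternate(*iters, **kwargs):
    # Same shape as the function above, with zip_longest replacing izip_longest.
    return chain.from_iterable(zip_longest(*iters, **kwargs))


strings = {'S1': ["string0", "string1", "string2"], 'S2': ["string0", "string1"]}
Ns = {'S1': ["N0", "N1"], 'S2': ["N0"]}

# No manual padding of Ns[k]; fillvalue='' covers lists of any relative length.
result = {k: ''.join(alternate(strings[k], Ns[k], fillvalue='')) for k in Ns}
print(result)
# {'S1': 'string0N0string1N1string2', 'S2': 'string0N0string1'}
```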
If you have an older Python version that doesn't support dict comprehensions, you could use this instead:
result = dict((k, ''.join(alternate(strings[k], Ns[k] + ['']))) for k in Ns)