
Python combine 2 lists based on given key

I am dealing with a problem. Here is the input:

list1 = ['A', 'U', 'C', 'C', 'A']
list2 = ['12', '14']
key = {'A12':'*', 'C14':'#'}

The output should look like this:

output1 = [['A12', 'U', 'C14', 'C', 'A'], ['A12', 'U', 'C', 'C14', 'A'],['A', 'U', 'C14', 'C', 'A12'], ['A', 'U', 'C', 'C14', 'A12']]

and then be converted to:

output2 = [['*', 'U', '#', 'C', 'A'], ['*', 'U', 'C', '#', 'A'],['A', 'U', '#', 'C', '*'], ['A', 'U', 'C', '#', '*']]

I am using Python 2.7 to solve this problem, but I have not figured it out yet. Any answer or suggestion will be appreciated!

Here is my code:

import itertools

list1 = ['A', 'U', 'C', 'C', 'A']
list2 = ['12', '14']
key = {'A12':'*', 'C14':'#'}

list3 = ['12', '14', '0', '0', '0']  # built by hand: list2 padded with '0' to the length of list1
combo = list(set(itertools.permutations(list3, len(list3))))
list_combo = []
for each_list in combo:
    new_list = []
    for i in xrange(len(list1)):
        if list1[i] + each_list[i] in key:
            new_list.append(key[list1[i] + each_list[i]])
        else:
            new_list.append(list1[i])
    list_combo.append(new_list)
print list_combo

There are some extra lists in the output, and if list2 or list3 is large, itertools.permutations takes a long time to run, so I am looking for another way to solve this problem.

OK, this is a little long, so bear with me. The first step is constructing a dictionary that maps each letter to its letter-with-number form, i.e. A to A12, etc.

replacements = dict((k[0],k) for k in key.keys())
# replacements is equal to {'A': 'A12', 'C': 'C14'}
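Note that k[0] assumes every key starts with a single-character letter, and that list2 is never consulted. As a minimal sketch, assuming keys always have the form letters followed by digits, the mapping could instead be built with a regular expression and limited to the numbers present in list2. The pattern and the filtering by list2 are assumptions added for illustration, not part of the approach above:

```python
import re

list2 = ['12', '14']
key = {'A12': '*', 'C14': '#'}

# Assumed key format: one or more letters followed by one or more digits.
pattern = re.compile(r'^([A-Za-z]+)(\d+)$')

replacements = {}
for k in key:
    m = pattern.match(k)
    # Keep only the keys whose numeric part appears in list2 (an assumption).
    if m and m.group(2) in list2:
        replacements[m.group(1)] = k

print(replacements)
```

For the sample data this produces the same mapping as the one-liner above.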

This makes things much simpler later. The next step is building a list of all the indices that need replacing, broken into one sublist per type of replacement.

indicies = [[i for i,x in enumerate(list1) if x == k] for k in replacements.keys()]
# indicies is equal to [[0, 4], [2, 3]]

Finally, we use itertools.product on the list of indices that need replacing to get each possible grouping, and then build the corresponding output lists:

import itertools

output1 = []
output2 = []
# Each group holds one chosen index per replacement type, e.g. (0, 2).
for group in itertools.product(*indicies):
    l = []
    l2 = []
    for i in range(len(list1)):
        # Replace only the positions chosen for this group.
        l.append(list1[i] if i not in group else replacements[list1[i]])
        l2.append(list1[i] if i not in group else key[replacements[list1[i]]])
    output1.append(l)
    output2.append(l2)
print output1
print output2

This gives us the desired output:

[['A12', 'U', 'C14', 'C', 'A'], ['A12', 'U', 'C', 'C14', 'A'], ['A', 'U', 'C14', 'C', 'A12'], ['A', 'U', 'C', 'C14', 'A12']]
[['*', 'U', '#', 'C', 'A'], ['*', 'U', 'C', '#', 'A'], ['A', 'U', '#', 'C', '*'], ['A', 'U', 'C', '#', '*']]
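The steps above can be collected into one function. This is only a sketch: the function name combine and the sorting of the letters are my own additions. Sorting is there because dictionary order is not guaranteed in Python 2.7, so without it the order of the result lists may vary. The code is written to run on both Python 2.7 and Python 3.

```python
import itertools

def combine(letters, key):
    """Return (tagged, converted) lists, one pair per way of replacing
    exactly one occurrence of each keyed letter. Hypothetical helper."""
    # Map each letter to its tagged form, e.g. 'A' -> 'A12'.
    # Assumes each key is a single letter followed by a number.
    replacements = dict((k[0], k) for k in key)
    # Sort for a deterministic result order.
    targets = sorted(replacements)
    # One sublist of candidate positions per letter to replace.
    positions = [[i for i, x in enumerate(letters) if x == t] for t in targets]

    tagged, converted = [], []
    for group in itertools.product(*positions):
        chosen = set(group)
        tagged.append([replacements[x] if i in chosen else x
                       for i, x in enumerate(letters)])
        converted.append([key[replacements[x]] if i in chosen else x
                          for i, x in enumerate(letters)])
    return tagged, converted

list1 = ['A', 'U', 'C', 'C', 'A']
key = {'A12': '*', 'C14': '#'}
output1, output2 = combine(list1, key)
print(output1)
print(output2)
```

If a keyed letter does not occur in the input list at all, its position list is empty and itertools.product yields nothing, so both results come back empty.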

One major difference between this code and the code you are running is that mine runs exactly as many iterations as it needs to: 4 for your sample data set. Your code appears to run at least some parts n! times, where n is the size of list1, which is 120 for your sample data set. My version is still likely to run for a while on extremely large data sets (that is the nature of this sort of problem), but it scales with the number of replacements it has to make rather than with the overall size of the data set.
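To make the comparison concrete, here is a small sketch that counts the iterations for the sample data. The variable names are mine. The product count is the number of occurrences of each keyed letter multiplied together, while the permutation count is the factorial of the list length:

```python
from math import factorial

list1 = ['A', 'U', 'C', 'C', 'A']
key = {'A12': '*', 'C14': '#'}

# Letters that have a replacement, taken from the first character of each key.
letters = sorted(set(k[0] for k in key))
# How many times each of those letters occurs in list1.
counts = [list1.count(letter) for letter in letters]

# itertools.product yields one group per combination of positions.
product_iterations = 1
for c in counts:
    product_iterations *= c

# itertools.permutations over a list of this length yields n! tuples.
permutation_iterations = factorial(len(list1))

print(product_iterations)      # 4
print(permutation_iterations)  # 120
```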
