
Python - List of unique sequences

I have a dictionary whose values are lists of elements in some order:

a = {'seq1':['5', '4', '3', '2', '1', '6', '7', '8', '9'], 
     'seq2':['9', '8', '7', '6', '5', '4', '3', '2', '1'],
     'seq3':['5', '4', '3', '2', '1', '11', '12', '13', '14'],
     'seq4':['15', '16', '17'],
     'seq5':['18', '19', '20', '21', '22', '23'],
     'seq6':['18', '19', '20', '24', '25', '26']}

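So there are 6 sequences.

What I need to do is:

  • Find only the unique lists. If two lists contain the same elements (regardless of order), they are not unique, so I need to get rid of the second list (the first one found stays).
  • Within the unique lists, find the unique subsequences of elements and print them.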
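The first requirement can be sketched as follows. This is only a minimal sketch, assuming Python 3.7+ (where dictionaries keep insertion order, so "first" means first in the dictionary). The helper name `drop_same_element_lists` is hypothetical and not part of the original question.

```python
def drop_same_element_lists(seqs):
    """Keep only the first list for each distinct set of elements."""
    seen = set()   # frozensets of the element sets already encountered
    kept = {}
    for name, seq in seqs.items():
        key = frozenset(seq)   # order-insensitive fingerprint of the list
        if key not in seen:
            seen.add(key)
            kept[name] = seq
    return kept


a = {'seq1': ['5', '4', '3', '2', '1', '6', '7', '8', '9'],
     'seq2': ['9', '8', '7', '6', '5', '4', '3', '2', '1'],
     'seq3': ['5', '4', '3', '2', '1', '11', '12', '13', '14'],
     'seq4': ['15', '16', '17'],
     'seq5': ['18', '19', '20', '21', '22', '23'],
     'seq6': ['18', '19', '20', '24', '25', '26']}

unique = drop_same_element_lists(a)
print(list(unique))  # ['seq1', 'seq3', 'seq4', 'seq5', 'seq6']
```

Here 'seq2' is dropped because it holds exactly the same elements as 'seq1', only in reverse order.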

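The bounds of the unique subsequences are found by the similarity of the order of elements: in the 1st and 3rd lists the common part ends exactly after element '1', so we get the subsequence ['5','4','3','2','1'].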
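One way to read this rule is as a common-prefix search between two lists. The sketch below assumes that interpretation; the function name `common_prefix` is hypothetical, and it only handles a shared beginning, not a shared ending.

```python
def common_prefix(first, second):
    """Return the leading elements that two lists share, in order."""
    prefix = []
    for x, y in zip(first, second):
        if x != y:
            break          # the order stops matching here
        prefix.append(x)
    return prefix


seq1 = ['5', '4', '3', '2', '1', '6', '7', '8', '9']
seq3 = ['5', '4', '3', '2', '1', '11', '12', '13', '14']

shared = common_prefix(seq1, seq3)
print(shared)               # ['5', '4', '3', '2', '1']
print(seq1[len(shared):])   # ['6', '7', '8', '9']
print(seq3[len(shared):])   # ['11', '12', '13', '14']
```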

As a result, I would like to see the elements in exactly the same order as at the start (if that is possible). So I expect this:

[['5', '4', '3', '2', '1'], ['6', '7', '8', '9'], ['11', '12', '13', '14'], ['15', '16', '17'], ['18', '19', '20'], ['21', '22', '23'], ['24', '25', '26']]

I tried to do it this way:

import itertools

unique_sets = []

a = {'seq1':["5","4","3","2","1","6","7","8","9"], 'seq2':["9","8","7","6","5","4","3","2","1"], 'seq3':["5","4","3","2","1","11","12","13","14"], 'seq4':["15","16","17"], 'seq5':["18","19","20","21","22","23"], 'seq6':["18","19","20","24","25","26"]}

b = []

for seq in a.values():
    b.append(seq)

for seq1, seq2 in itertools.combinations(b,2):                                     #searching for intersections 
    if set(seq1).intersection(set(seq2)) not in unique_sets:
        #if set(seq1).intersection(set(seq2)) == set(seq1):
            #continue
        unique_sets.append(set(seq1).intersection(set(seq2)))
    if set(seq1).difference(set(seq2)) not in unique_sets:
        unique_sets.append(set(seq1).difference(set(seq2)))

for it in unique_sets:
    print(it)

What I got is somewhat different from what I expected:

{'9', '5', '2', '3', '7', '1', '4', '8', '6'}
set()
{'5', '2', '3', '1', '4'}
{'9', '8', '6', '7'}
{'5', '2', '14', '3', '1', '11', '12', '4', '13'}
{'17', '16', '15'}
{'19', '20', '18'}
{'23', '21', '22'}

The results are even worse if the commented-out lines in the code above are uncommented.

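Also, I have a problem with the elements in the sets being unordered, as the output above shows. I tried doing the same thing with two separate lists: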

seq1 = set([1,2,3,4,5,6,7,8,9])
seq2 = set([1,2,3,4,5,10,11,12])

And it worked fine: the elements never changed their positions in the sets. Where is my mistake?
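On the ordering question: Python sets make no promise about iteration order. In CPython small integers hash to their own value, so a small set of small integers often happens to come out in ascending order, while string hashes are randomized per process by default. A minimal sketch of an order-preserving alternative is shown below; it uses a set only for membership tests and builds the results as lists. The function name `ordered_split` is hypothetical.

```python
def ordered_split(seq, other):
    """Split seq into (elements also in other, elements only in seq),
    keeping the original order of seq in both parts."""
    other_set = set(other)   # used only for fast membership tests
    common = [x for x in seq if x in other_set]
    rest = [x for x in seq if x not in other_set]
    return common, rest


seq1 = ['5', '4', '3', '2', '1', '6', '7', '8', '9']
seq3 = ['5', '4', '3', '2', '1', '11', '12', '13', '14']

common, rest = ordered_split(seq1, seq3)
print(common)  # ['5', '4', '3', '2', '1']
print(rest)    # ['6', '7', '8', '9']
```

Unlike `set.intersection` and `set.difference`, the list comprehensions walk `seq` from left to right, so the output order always matches the input order.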

Thanks.

Update: OK, now I have a more complex task for which the algorithm provided here does not work.

I have this dictionary:

precond = {

'seq1':     ["1","2"],
'seq2':     ["3","4","2"],
'seq3':     ["5","4","2"],
'seq4':     ["6","7","4","2"],
'seq5':     ["6","4","7","2"],
'seq6':     ["6","1","8","9","10"],
'seq7':     ["6","1","8","11","9","12","13","14"],
'seq8':     ["6","1","8","11","4","15","13"],
'seq9':     ["6","1","8","16","9","11","4","17","18","2"],
'seq10':    ["6","1","8","19","9","4","16","2"],
}

I expect these sequences, each containing at least 2 elements:

[1, 2],
[4, 2],
[6, 7],
[6, 4, 7, 2],
[6, 1, 8],
[9, 10],
[6, 1, 8, 11],
[9, 12, 13, 14],
[4, 15, 13],
[16, 9, 11, 4, 17, 18, 2],
[19, 9, 4, 16, 2]

So far I have written this code:

precond = {

    'seq1':     ["1","2"],
    'seq2':     ["3","4","2"],
    'seq3':     ["5","4","2"],
    'seq4':     ["6","7","4","2"],
    'seq5':     ["6","4","7","2"],
    'seq6':     ["6","1","8","9","10"],
    'seq7':     ["6","1","8","11","9","12","13","14"],
    'seq8':     ["6","1","8","11","4","15","13"],
    'seq9':     ["6","1","8","16","9","11","4","17","18","2"],
    'seq10':    ["6","1","8","19","9","4","16","2"],
}

seq_list = []
result_seq = []
#d = []

for seq in precond.values():
    seq_list.append(seq)

#print(b)

contseq_ind = 0
control_seq = seq_list[contseq_ind]
mainseq_ind = 1
el_ind = 0
#index2 = 0

def compar():
    if control_seq[contseq_ind] != seq_list[mainseq_ind][el_ind]:
        mainseq_ind += 1
        compar()
    else:
        result_seq.append(control_seq[contseq_ind])
        contseq_ind += 1
        el_ind += 1

        if contseq_ind > len(control_seq):
            control_seq = seq_list[contseq_ind + 1]
            compar()
        else:
            compar()


compar()

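In any case, this code is not complete: I wrote it to look for identical elements from the beginning only, so I still need to write code that searches for a common sequence at the end of the two lists being compared.

Right now I have a problem with the recursion. After the first recursive call I get this error: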

if control_seq[contseq_ind] != b[mainseq_ind][el_ind]:
UnboundLocalError: local variable 'control_seq' referenced before assignment

How can I fix this? Or maybe you have a better idea than using recursion? Thank you in advance.
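Regarding the error itself: in Python, assigning to a name anywhere inside a function (including with `+=`) makes that name local to the function, so reading it before the assignment raises UnboundLocalError. The sketch below reproduces this with a hypothetical module-level `counter` (not a name from the code above) and shows two common ways around it: a `global` declaration, or passing the state in as a parameter and returning the new value.

```python
counter = 0


def broken():
    # 'counter += 1' assigns to counter, so Python treats it as a local
    # variable of this function; reading it first raises UnboundLocalError.
    counter += 1
    return counter


def with_global():
    global counter   # explicitly refer to the module-level variable
    counter += 1
    return counter


def with_parameter(value):
    # No shared state: take the current value in, hand the new value back.
    return value + 1


try:
    broken()
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)          # UnboundLocalError
print(with_global())       # 1
print(with_parameter(1))   # 2
```

In `compar()` the names `control_seq`, `contseq_ind`, `mainseq_ind` and `el_ind` are all assigned inside the function, so each would need the same treatment. Passing them as arguments, or replacing the recursion with a loop, also avoids Python's recursion depth limit.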

Not sure if this is what you want, but it gives the same result:

from collections import OrderedDict

a = {'seq1':["5","4","3","2","1","6","7","8","9"],
     'seq2':["9","8","7","6","5","4","3","2","1"],
     'seq3':["5","4","3","2","1","11","12","13","14"],
     'seq4':["15","16","17"],
     'seq5':["18","19","20","21","22","23"],
     'seq6':["18","19","20","24","25","26"]}

level = 0
counts = OrderedDict()
# Go through every element of every list, counting how many times it
# is used and recording the index of the list it first appeared in.
for elements in a.values():
    for element in elements:
        if element in counts:
            # separate names here so the dictionary `a` is not overwritten
            first_level, count = counts[element]
            counts[element] = (first_level, count + 1)
        else:
            counts[element] = (level, 1)
    level += 1

last = 0
result = []
# now break up the dictionary of unique values into lists according 
# to the count of each value and the level that they existed in 
for k,v in counts.items():
    if v == last:
        result[-1].append(k)
    else:
        result.append([k])
    last = v

print(result)

Result:

[['5', '4', '3', '2', '1'], 
 ['6', '7', '8', '9'], 
 ['11', '12', '13', '14'], 
 ['15', '16', '17'], 
 ['18', '19', '20'], 
 ['21', '22', '23'], 
 ['24', '25', '26']]
