Slice a list of tuples into a list of lists if the first and another element are the same (Python)
I have the following list A of tuples, and I would like to slice A into a list of lists, as shown in B. The logic is that if the first and the fourth elements of the tuples repeat, that group is packed as its own list inside the resulting list.
A = [(1, 'C-30219', 'C-30060', 'C-6235d935d39c258876476e35a7acfd69-1-1', 2),
(1, 'C-30060', 'C-30022', 'C-6235d935d39c258876476e35a7acfd69-1-1', 3),
(1, 'C-30022', 'C-30205', 'C-6235d935d39c258876476e35a7acfd69-1-1', 4),
(3, 'C-30248', 'C-30260', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 4),
(3, 'C-30260', 'C-30108', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 5),
(3, 'C-30108', 'C-30240', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 6),
(5, 'C-30269', 'C-30285', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 9),
(5, 'C-30285', 'C-30109', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 10),
(5, 'C-30109', 'C-30211', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 11),
(5, 'C-30211', 'C-30289', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 12),
(5, 'C-30072', 'C-30375', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 15),
(5, 'C-30375', 'C-30095', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 16)]
Out:
B = [[(1, 'C-30219', 'C-30060', 'C-6235d935d39c258876476e35a7acfd69-1-1', 2),
(1, 'C-30060', 'C-30022', 'C-6235d935d39c258876476e35a7acfd69-1-1', 3),
(1, 'C-30022', 'C-30205', 'C-6235d935d39c258876476e35a7acfd69-1-1', 4)],
[(3, 'C-30248', 'C-30260', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 4),
(3, 'C-30260', 'C-30108', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 5),
(3, 'C-30108', 'C-30240', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 6)],
[(5, 'C-30269', 'C-30285', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 9),
(5, 'C-30285', 'C-30109', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 10),
(5, 'C-30109', 'C-30211', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 11),
(5, 'C-30211', 'C-30289', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 12)],
[(5, 'C-30072', 'C-30375', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 15),
(5, 'C-30375', 'C-30095', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 16)]]
Here is my attempt, which gives the desired output after a lot of mumbo jumbo. I am seeking a more efficient and Pythonic way to achieve this.
inter = list(set([(i[0],i[3]) for i in A]))
B = {o_t: [] for o_t in inter}
for i in range(1, len(A)):
    if (A[i][0] == A[i-1][0]
            and A[i][3] == A[i-1][3]):
        B[A[i][0],A[i][3]].append(A[i])
        B[A[i][0],A[i][3]].append(A[i-1])
B = {key: sorted(list(set(B[key])), key = lambda x: x[-1]) for key in B.keys()}
list(B.values())
This is a perfect task for groupby from itertools:
from itertools import groupby
A = [(1, 'C-30219', 'C-30060', 'C-6235d935d39c258876476e35a7acfd69-1-1', 2),
(1, 'C-30060', 'C-30022', 'C-6235d935d39c258876476e35a7acfd69-1-1', 3),
(1, 'C-30022', 'C-30205', 'C-6235d935d39c258876476e35a7acfd69-1-1', 4),
(3, 'C-30248', 'C-30260', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 4),
(3, 'C-30260', 'C-30108', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 5),
(3, 'C-30108', 'C-30240', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 6),
(5, 'C-30269', 'C-30285', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 9),
(5, 'C-30285', 'C-30109', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 10),
(5, 'C-30109', 'C-30211', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 11),
(5, 'C-30211', 'C-30289', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 12),
(5, 'C-30072', 'C-30375', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 15),
(5, 'C-30375', 'C-30095', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 16)]
B = [list(g) for _,g in groupby(A, key=lambda x: (x[0], x[3]))]
print(B)
Output:
[[(1, 'C-30219', 'C-30060', 'C-6235d935d39c258876476e35a7acfd69-1-1', 2),
(1, 'C-30060', 'C-30022', 'C-6235d935d39c258876476e35a7acfd69-1-1', 3),
(1, 'C-30022', 'C-30205', 'C-6235d935d39c258876476e35a7acfd69-1-1', 4)],
[(3, 'C-30248', 'C-30260', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 4),
(3, 'C-30260', 'C-30108', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 5),
(3, 'C-30108', 'C-30240', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 6)],
[(5, 'C-30269', 'C-30285', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 9),
(5, 'C-30285', 'C-30109', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 10),
(5, 'C-30109', 'C-30211', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 11),
(5, 'C-30211', 'C-30289', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 12)],
[(5, 'C-30072', 'C-30375', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 15),
(5, 'C-30375', 'C-30095', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 16)]]
NOTE: I am assuming that A is sorted by its first and fourth elements. groupby only groups consecutive items: it will group a list [1,1,1,2,2,1,3,3] into [(1,1,1), (2,2), (1), (3,3)]. It won't group all the 1's together.
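To make that caveat concrete, here is a minimal sketch using made-up data (the names nums, unsorted_A and by_first_and_fourth are hypothetical and not part of the question). It shows that groupby only merges consecutive runs, and that sorting by the same key first brings scattered items together. Sorting is an assumption about what you want: it changes the order of the groups.

```python
from itertools import groupby

# Hypothetical sample data, chosen only to illustrate the behaviour.
nums = [1, 1, 1, 2, 2, 1, 3, 3]

# groupby merges only consecutive equal items, so the lone 1 stays separate.
runs = [list(g) for _, g in groupby(nums)]
print(runs)

# Sorting first brings equal items next to each other before grouping.
merged = [list(g) for _, g in groupby(sorted(nums))]
print(merged)

# The same idea for tuples keyed on the first and fourth elements.
def by_first_and_fourth(t):
    return (t[0], t[3])

unsorted_A = [(1, 'a', 'b', 'X', 2), (3, 'c', 'd', 'Y', 4), (1, 'b', 'e', 'X', 3)]
grouped = [list(g) for _, g in groupby(sorted(unsorted_A, key=by_first_and_fourth),
                                       key=by_first_and_fourth)]
print(grouped)
```

Because sorted is stable, tuples that share a key keep their original relative order inside each group.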
Below is an approach using defaultdict:
from collections import defaultdict
data = defaultdict(list)
A = [(1, 'C-30219', 'C-30060', 'C-6235d935d39c258876476e35a7acfd69-1-1', 2),
(1, 'C-30060', 'C-30022', 'C-6235d935d39c258876476e35a7acfd69-1-1', 3),
(1, 'C-30022', 'C-30205', 'C-6235d935d39c258876476e35a7acfd69-1-1', 4),
(3, 'C-30248', 'C-30260', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 4),
(3, 'C-30260', 'C-30108', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 5),
(3, 'C-30108', 'C-30240', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 6),
(5, 'C-30269', 'C-30285', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 9),
(5, 'C-30285', 'C-30109', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 10),
(5, 'C-30109', 'C-30211', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 11),
(5, 'C-30211', 'C-30289', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 12),
(5, 'C-30072', 'C-30375', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 15),
(5, 'C-30375', 'C-30095', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 16)]
for a in A:
    data[(a[0], a[3])].append(a)
B = [v for v in data.values()]
for b in B:
    print(b)
Output:
[(1, 'C-30219', 'C-30060', 'C-6235d935d39c258876476e35a7acfd69-1-1', 2), (1, 'C-30060', 'C-30022', 'C-6235d935d39c258876476e35a7acfd69-1-1', 3), (1, 'C-30022', 'C-30205', 'C-6235d935d39c258876476e35a7acfd69-1-1', 4)]
[(3, 'C-30248', 'C-30260', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 4), (3, 'C-30260', 'C-30108', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 5), (3, 'C-30108', 'C-30240', 'C-ac19d0edcf4d4ebe071e8d43be1901e2-1-1', 6)]
[(5, 'C-30269', 'C-30285', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 9), (5, 'C-30285', 'C-30109', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 10), (5, 'C-30109', 'C-30211', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 11), (5, 'C-30211', 'C-30289', 'C-d0d36bb9f2a7e248638cff9a04065977-1-1', 12)]
[(5, 'C-30072', 'C-30375', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 15), (5, 'C-30375', 'C-30095', 'C-710c460e8dfc2b3a523e077b6c6bdb40-1-1', 16)]
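One practical difference from a consecutive-run approach is that a dictionary collects matching tuples even when they are not adjacent in the input. The sketch below wraps the same idea in a helper (group_by_first_and_fourth is a hypothetical name, and rows is made-up data) and runs it on deliberately unsorted input. It relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 onward.

```python
from collections import defaultdict

def group_by_first_and_fourth(rows):
    """Group tuples by (first, fourth) element, keeping first-seen key order."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[3])].append(row)
    return list(groups.values())

# Hypothetical, deliberately unsorted sample data.
rows = [(1, 'a', 'b', 'X', 2), (3, 'c', 'd', 'Y', 4), (1, 'b', 'e', 'X', 3)]
result = group_by_first_and_fourth(rows)
print(result)
```

Here both tuples with key (1, 'X') end up in the same group even though another tuple sits between them, and no sorting pass is needed.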