I need to group together sublists that share elements, so that each resulting sublist contains no element of list2
and contains at least one element of list3.
For example:
list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[13,15]]
list2 = [7,8]
list3 = [1,6,11,13]
I would link [4,5] and [1,4] together, since they share one number, and the two would combine into [1,4,5].
The combined sublist contains 1, which is in list3, and does not contain 7, which is in list2.
So after linking, the new list should be like:
new_list1 = [[1,4,5],[6],[11],[13,15]]
That is, no number should appear more than once inside a sub-list, and order is not important.
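Since order does not matter, one way to compare a candidate result against the expected one is to normalize both into sets. This is only a small sketch; the helper name `normalize` is hypothetical and not part of the original question.

```python
def normalize(groups):
    """Return an order-independent view of a list of sublists."""
    # frozenset ignores order inside a sublist; the outer set ignores sublist order.
    return {frozenset(group) for group in groups}


expected = [[1, 4, 5], [6], [11], [13, 15]]
candidate = [[15, 13], [6], [5, 4, 1], [11]]
print(normalize(expected) == normalize(candidate))  # True
```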
A longer example:
list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[6,8],[20,2],[20,3],[11,14],[14,16],[13,15]]
list2 = [7,8,20]
list3 = [1,2,3,16,15]
After linking, it would be:
new_list = [[1,4,5],[2,3],[11,14,16],[13,15]]
How can this be done in a general way?
EDIT: The final algorithm should comprise the following three basic steps:
1. Remove all elements from the sublists of list1 that are contained in list2.
2. Join all sublists of list1 that have common elements.
3. Remove all sublists of list1 that do not contain any elements of list3.
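The three steps can be sketched roughly as follows. This is only one possible reading of the requirements, not a definitive implementation: the function name `link_sublists` and its parameter names are assumptions, and sublists that become empty after step 1 are simply dropped. Each new set absorbs every existing group it overlaps with, so a sublist that bridges two earlier groups merges them into one.

```python
def link_sublists(sublists, exclude, include):
    """Sketch of the three steps; all names here are hypothetical."""
    exclude, include = set(exclude), set(include)

    # Step 1: remove excluded numbers and drop sublists that end up empty.
    pending = [set(sub) - exclude for sub in sublists]
    pending = [s for s in pending if s]

    # Step 2: merge sets that share elements. `groups` stays pairwise
    # disjoint, so absorbing every overlapping group keeps it that way.
    groups = []
    for current in pending:
        remaining = []
        for group in groups:
            if group & current:
                current = current | group  # absorb the overlapping group
            else:
                remaining.append(group)
        remaining.append(current)
        groups = remaining

    # Step 3: keep only groups containing at least one included number.
    return sorted(sorted(g) for g in groups if g & include)


list1 = [[1, 4], [4, 5], [5, 7], [6, 7], [9, 7], [10, 9], [8, 10], [8, 11], [8, 13], [13, 15]]
print(link_sublists(list1, [7, 8], [1, 6, 11, 13]))
# prints: [[1, 4, 5], [6], [11], [13, 15]]
```

Under this reading, the longer example yields [2] and [3] as separate groups rather than [2,3], because 20 is removed before any joining happens.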
Here's my take on it, if Thomas Kühn managed to properly read your mind:
def subgroup_join(data, exclude, include):
    exclude = set(exclude)  # turn into set for faster lookup/compare
    include = set(include)  # turn into set for faster lookup/compare
    data = [set(element) - exclude for element in data]  # remove excluded elements
    results = [set()]  # init with an empty set
    for element in data:  # loop through our remaining elements
        groups = []  # store elements / current results filtered by exclude list
        ignore_element = False  # flag if we should add the element as a standalone
        for result in results:  # go through each subgroup in the results
            if element & result:  # if the current element has common items with the result
                result |= element  # ... concatenate both into a subgroup
                ignore_element = True
            groups.append(result)  # add the current result subgroup
        if not ignore_element:  # add only if the element wasn't concatenated
            groups.append(element)  # add the current element
        results = groups  # our element store becomes our new results set
    # keep only subgroups that share an item with include, then sort & return
    return sorted([sorted(res) for res in results if res & include])
As for tests:
list1 = [[1, 4], [4, 5], [5, 7], [6, 7], [7, 8], [9, 7], [10, 9], [8, 10], [8, 11], [8, 13], [13, 15]]
list2 = [7, 8]
list3 = [1, 6, 11, 13]
print(subgroup_join(list1, list2, list3))
# prints: [[1, 4, 5], [6], [11], [13, 15]]
list1 = [[1, 4], [4, 5], [5, 7], [6, 7], [9, 7], [10, 9], [8, 10], [8, 11], [8, 13], [6, 8], [20, 2], [20, 3], [11, 14], [14, 16], [13, 15]]
list2 = [7, 8, 20]
list3 = [1, 2, 3, 16, 15]
print(subgroup_join(list1, list2, list3))
# prints: [[1, 4, 5], [2], [3], [11, 14, 16], [13, 15]]
This is probably the fastest of the approaches presented, but again, it doesn't exactly match your examples: check the last result set and the [2] and [3] results.
UPDATE:
When it comes to performance, using the second set of lists:
zwer_join - 100,000 loops: 2.849 s; per loop: 28.399 µs
kuhn_join - 100,000 loops: 3.071 s; per loop: 30.706 µs
nuag_join - 1,000 loops: 15.82 s; per loop: 15.819 ms (had to reduce the number of loops)
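The answer does not say how these timings were collected. As a rough sketch, figures in this format could be produced with the standard-library `timeit` module; `sample_join` below is a hypothetical stand-in workload, not one of the functions benchmarked above.

```python
import timeit


def sample_join(data, exclude):
    """Hypothetical stand-in: strip excluded numbers from each sublist."""
    exclude = set(exclude)
    return [set(item) - exclude for item in data]


list1 = [[1, 4], [4, 5], [5, 7], [6, 7]]
list2 = [7, 8]

loops = 10_000
# timeit.timeit returns the total time in seconds for `number` calls.
total = timeit.timeit(lambda: sample_join(list1, list2), number=loops)
print(f"sample_join - {loops:,} loops: {total:.3f} s; "
      f"per loop: {total / loops * 1e6:.3f} µs")
```

Absolute numbers depend on the machine and Python version, so only the relative ordering of the approaches is meaningful.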
This code should do the job:
list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[6,8],[20,2],[20,3],[11,14],[14,16],[13,15]]
list2 = [7,8,20]
list3 = [1,2,3,16,15]
list1a = [set(l) for l in list1]
# removing the numbers of list2 from the sub-lists of list1:
for x in list2:
    for l in list(list1a):
        if x in l:
            l.remove(x)
# joining sub-lists in list1:
list1b = [set(l) for l in list1a]
list1c = []
while list1b:
    s1 = list1b.pop(0)
    for s2 in list(list1b):
        if s1 & s2:
            s1 |= s2
            list1b.remove(s2)
    list1c.append(s1)
# generating final list with only sub-lists that contain elements of list3
list1_new = sorted([sorted(list(s)) for s in list1c if s & set(list3)])
For the first example, this gives:
[[1, 4, 5], [6], [11], [13, 15]]
and for the second example
[[1, 4, 5], [2], [3], [11, 14, 16], [13, 15]]
Hope this helps.
First, you start by writing your join function. I included the second list to remove unwanted elements.
Then you iterate through your joined list and check whether any of its elements are already in your result list. If so, you find the sublist they belong to and add the elements to it (I used set to avoid duplicates).
Outputs are given at the end.
def join(list1, list2):
    l = []
    for ee in list1:
        # We assume here that list1 only contains pairs
        if ee[0] not in list2 and ee[1] not in list2:
            flat_l = [x for e in l for x in e]
            if ee[0] in flat_l or ee[1] in flat_l:
                for i, e in enumerate(l):
                    if ee[0] in e:
                        l[i].append(ee[1])
                    if ee[1] in e:
                        l[i].append(ee[0])
            else:
                l.append(ee)
    return l

def f(list1, list2, list3):
    l = [[e] for e in list3]
    list1 = join(list1, list2)
    for ee in list1:
        flat_l = [x for e in l for x in e]
        for e in ee:
            if e in flat_l:
                for i in range(len(l)):
                    if e in l[i]:
                        l[i] = list(set(l[i] + ee))
    print(l)
list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[13,15]]
list2 = [7,8]
list3 = [1,6,11,13]
f(list1,list2,list3)
# [[1, 4, 5], [6], [11], [13, 15]]
list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[6,8],[20,2],[20,3],[11,14],[14,16],[13,15]]
list2 = [7,8,20]
list3 = [1,2,3,16,15]
f(list1,list2,list3)
# [[1, 4, 5], [2], [3], [16, 11, 14], [13, 15]]