
Python: link sublists that share elements, dropping elements found in list2 and keeping only groups with an element in list3

I need to group sublists that share elements. The resulting groups must contain no element of list2 and at least one element of list3. For example:

list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[13,15]]
list2 = [7,8]
list3 = [1,6,11,13]

I would link [4,5] and [1,4] because they share the number 4. The two combine into [1,4,5], which contains 1 from list3 and contains nothing from list2 (such as 7).

So after linking, the new list should be like:

new_list1 = [[1,4,5],[6],[11],[13,15]]

That is, no number should appear twice within a sub-list, and order is not important.

A longer example:

list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[6,8],[20,2],[20,3],[11,14],[14,16],[13,15]]
list2 = [7,8,20]
list3 = [1,2,3,16,15]

After linking, it would be:

new_list = [[1,4,5],[2,3],[11,14,16],[13,15]]

How can this be done in a general way?

EDIT: The final algorithm should consist of the following three basic steps:

  1. Remove all elements of all sub-lists of list1 that are contained in list2
  2. Join all sub-lists of list1 that have common elements
  3. Remove all sub-lists of list1 that do not contain any elements of list3
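The three steps above can be sketched with a small union-find over the individual numbers. This is only one possible reading of the steps, not any answerer's code: the name `link_sublists` and its helpers are made up for illustration. Because it tracks connections per element, it also copes with a later sub-list that bridges two groups formed earlier.

```python
def link_sublists(sublists, exclude, include):
    """Hypothetical helper: strip `exclude`, join overlapping sub-lists, filter by `include`."""
    exclude, include = set(exclude), set(include)
    parent = {}  # union-find forest: element -> parent element

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for sub in sublists:
        # Step 1: drop the excluded elements from this sub-list.
        kept = [x for x in sub if x not in exclude]
        for x in kept:
            find(x)  # register every surviving element, even a lone one
        # Step 2: everything left in the same sub-list belongs together.
        for a, b in zip(kept, kept[1:]):
            parent[find(a)] = find(b)

    # Collect the elements of each connected group.
    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)

    # Step 3: keep only groups that contain at least one element of `include`.
    return sorted(sorted(g) for g in groups.values() if g & include)


list1 = [[1, 4], [4, 5], [5, 7], [6, 7], [9, 7], [10, 9], [8, 10], [8, 11], [8, 13], [13, 15]]
print(link_sublists(list1, [7, 8], [1, 6, 11, 13]))
# prints: [[1, 4, 5], [6], [11], [13, 15]]
```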

Here's my take on it, if Thomas Kühn managed to properly read your mind:

def subgroup_join(data, exclude, include):
    exclude = set(exclude)  # turn into set for faster lookup/compare
    include = set(include)  # turn into set for faster lookup/compare
    data = [set(element) - exclude for element in data]  # remove excluded elements
    results = [set()]  # init with an empty set
    for element in data:  # loop through our remaining elements
        groups = []  # store elements / current results filtered by exclude list
        ignore_element = False  # flag if we should add the element as a standalone
        for result in results:  # go through each subgroup in the results
            if element & result:  # if the current element has common items with the result
                result |= element  # ... concatenate both into a subgroup
                ignore_element = True
            groups.append(result)  # add the current result subgroup
        if not ignore_element:  # add only if the element wasn't concatenated
            groups.append(element)  # add the current element
        results = groups  # our element store becomes our new results set
    return sorted([sorted(res) for res in results if res & include])  # keep groups touching include; sort & return

As for tests:

list1 = [[1, 4], [4, 5], [5, 7], [6, 7], [7, 8], [9, 7], [10, 9], [8, 10], [8, 11], [8, 13], [13, 15]]
list2 = [7, 8]
list3 = [1, 6, 11, 13]

print(subgroup_join(list1, list2, list3))
# prints: [[1, 4, 5], [6], [11], [13, 15]]

list1 = [[1, 4], [4, 5], [5, 7], [6, 7], [9, 7], [10, 9], [8, 10], [8, 11], [8, 13], [6, 8], [20, 2], [20, 3], [11, 14], [14, 16], [13, 15]]
list2 = [7, 8, 20]
list3 = [1, 2, 3, 16, 15]

print(subgroup_join(list1, list2, list3))
# prints: [[1, 4, 5], [2], [3], [11, 14, 16], [13, 15]]

This is probably the fastest of the approaches presented, but again, it doesn't exactly match your examples: check the last result set, where [2] and [3] come out as separate groups instead of [2, 3].
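To see why 2 and 3 end up apart under the three-step reading, here is a tiny illustration using only the two relevant pairs from the second example. Once 20 is removed, nothing links them any more.

```python
pairs = [[20, 2], [20, 3]]
exclude = {20}

# Step 1 removes 20 from both pairs, leaving two single-element sets.
stripped = [set(pair) - exclude for pair in pairs]
print(stripped)                    # [{2}, {3}]

# Step 2 joins only sets with a common element, and these have none.
print(stripped[0] & stripped[1])   # set()
```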

UPDATE:

As for performance, timing each approach on the second set of lists gives:

zwer_join - 100,000 loops: 2.849 s; per loop: 28.399 µs
kuhn_join - 100,000 loops: 3.071 s; per loop: 30.706 µs
nuag_join -   1,000 loops: 15.82 s; per loop: 15.819 ms (had to reduce the number of loops)
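The timing harness itself isn't shown, so the following is only a guess at how such numbers could be gathered with the standard `timeit` module. `bench` and `strip_only` are hypothetical names; `strip_only` is just a stand-in for whichever of the three functions you want to measure.

```python
import timeit

list1 = [[1, 4], [4, 5], [5, 7], [6, 7], [9, 7], [10, 9], [8, 10], [8, 11], [8, 13],
         [6, 8], [20, 2], [20, 3], [11, 14], [14, 16], [13, 15]]
list2 = [7, 8, 20]
list3 = [1, 2, 3, 16, 15]


def bench(func, loops=100_000):
    """Time `loops` calls of func(list1, list2, list3) and print a one-line summary."""
    total = timeit.timeit(lambda: func(list1, list2, list3), number=loops)
    per_loop_us = total / loops * 1e6
    print(f"{func.__name__} - {loops:,} loops: {total:.3f} s; per loop: {per_loop_us:.3f} µs")
    return total, per_loop_us


def strip_only(data, exclude, include):
    """Stand-in workload (assumption): only performs the exclusion step."""
    exclude = set(exclude)
    return [set(sub) - exclude for sub in data]


bench(strip_only, loops=1_000)
```

Absolute timings depend on the machine and Python version, so only the relative ordering is meaningful.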

This code should do the job:

list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[6,8],[20,2],[20,3],[11,14],[14,16],[13,15]]
list2 = [7,8,20]
list3 = [1,2,3,16,15]

list1a = [set(l) for l in list1]
# removing the numbers of list2 from the sub-lists of list1:
for x in list2:
    for l in list(list1a):
        if x in l:
            l.remove(x)

#joining sub-lists in list1:
list1b = [set(l) for l in list1a]
list1c = []
while list1b:
    s1 = list1b.pop(0)
    for s2 in list(list1b):
        if s1 & s2:
            s1 |= s2
            list1b.remove(s2)
    list1c.append(s1)

#generating final list with only sub-lists that contain elements of list3
list1_new = sorted([sorted(list(s)) for s in list1c if s & set(list3)])

For the first example, this gives:

[[1, 4, 5], [6], [11], [13, 15]]

and for the second example

[[1, 4, 5], [2], [3], [11, 14, 16], [13, 15]]
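One caveat, offered as a hedged aside: a single pass over the remaining sets works for both examples here, but it can miss a link when a set that was skipped earlier only starts to overlap after a later merge. A possible variant (the name `merge_until_stable` is made up) rescans until nothing more merges:

```python
def merge_until_stable(groups):
    """Hypothetical variant of the joining loop that rescans after every merge."""
    pending = [set(g) for g in groups]
    merged = []
    while pending:
        current = pending.pop(0)
        changed = True
        while changed:  # rescan, because `current` may have grown
            changed = False
            for other in list(pending):
                if current & other:
                    current |= other
                    pending.remove(other)
                    changed = True
        merged.append(current)
    return merged


# [3, 4] only connects to [1, 2] through [2, 3], which comes later in the list.
print(merge_until_stable([[1, 2], [3, 4], [2, 3]]))
# prints: [{1, 2, 3, 4}]
```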

Hope this helps.

First, write a join function. It also takes the second list so that pairs containing unwanted elements can be dropped.

Then iterate through the joined list and check whether any of its elements already appear in the result list (which starts out as the elements of list3). If so, find the group each one belongs to and add the elements to it (I used a set to avoid duplicates).

Outputs are given at the end.

def join(list1, list2):
    l = []
    for ee in list1:
        # This assumes list1 contains only pairs
        if ee[0] not in list2 and ee[1] not in list2:
            flat_l = [x for e in l for x in e]
            if ee[0] in flat_l or ee[1] in flat_l:
                for i, e in enumerate(l):
                    if ee[0] in e:
                        l[i].append(ee[1])
                    if ee[1] in e:
                        l[i].append(ee[0])
            else:
                l.append(ee)
    return l

def f(list1,list2,list3):
    l = [[e] for e in list3]
    list1 = join(list1, list2)
    for ee in list1:
        flat_l = [x for e in l for x in e]
        for e in ee:
            if e in flat_l:
                for i in range(len(l)):
                    if e in l[i]:
                        l[i] = list(set(l[i]+ee))
    print(l)

list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[13,15]]
list2 = [7,8]
list3 = [1,6,11,13]

f(list1,list2,list3)
# [[1, 4, 5], [6], [11], [13, 15]]

list1 = [[1,4],[4,5],[5,7],[6,7],[9,7],[10,9],[8,10],[8,11],[8,13],[6,8],[20,2],[20,3],[11,14],[14,16],[13,15]]
list2 = [7,8,20]
list3 = [1,2,3,16,15]

f(list1,list2,list3)
# [[1, 4, 5], [2], [3], [16, 11, 14], [13, 15]]
