Split list into multiple groups based on co-existence rules

I would like to create groups of strings (or objects) based on 'exclusion' rules, i.e. which items can coexist with, or 'talk' to, other items.

For example, let's say I have a list of names:

names = ['proxy', 's1', 's2', 'queue', 'w1']

Then I say, 'proxy' can talk with 's1' and 's2' but 's1' cannot talk to 's2' (and this should be mutual). I can represent these rules and names as a list of objects:

proxy = {'name': 'proxy', 'exclude': ['q', 'w1']}
s1 = {'name': 's1', 'exclude': ['s2', 'w1']}
s2 = {'name': 's2', 'exclude': ['s1', 'w1']}
q = {'name': 'queue', 'exclude': ['proxy']}
w1 = {'name': 'w1', 'exclude': ['proxy', 's1', 's2']}

Here I would expect to end up with 5 groups:

[
  ['proxy', 's1'],
  ['proxy', 's2'],
  ['s1', 'queue'],
  ['s2', 'queue'],
  ['queue', 'w1']
]

I have tried to use set.difference on the full list of names for each 'exclusion' and then remove any sets that are equal, but this is not enough. For example, for the first item, 'proxy', I would end up with a set of ['proxy', 's1', 's2'], but of course 's1' and 's2' cannot be together.

I am stumped for a solution to this problem, but I have a feeling it is, or is similar to, a common mathematical/set theory problem?

Extra Info: As mentioned nicely by @MrFuppes, connections are bidirectional (i.e. if X excludes Z then Z should exclude X); this should be assumed and, if possible, inferred. This is to make it simpler for the user, so they do not have to state both rules explicitly. It may be that my 'rule schema' is not the optimal way of gathering the data from the user to solve the problem, and if there is a better way I am all for it.

Definitely not the most beautiful Python ever written, but it will do the trick. I took the liberty of collecting the rules in a rules dictionary and renamed q to queue for consistency. If that's a problem, we'll find a workaround.

rules = {'proxy': {'name': 'proxy', 'exclude': ['queue', 'w1']},
         's1': {'name': 's1', 'exclude': ['s2', 'w1']},
         's2': {'name': 's2', 'exclude': ['s1', 'w1']},
         'queue': {'name': 'queue', 'exclude': ['proxy']},
         'w1': {'name': 'w1', 'exclude': ['proxy', 's1', 's2']}}
names = ['proxy', 's1', 's2', 'queue', 'w1']

And now for the code itself. I first deep-copy the rules dict, because I will be adding connections that have already been listed as excludes and don't want to mess with the original rule set. The rest is pretty straightforward, so I won't comment on it too much: for each connector, get all allowed names by dropping the connector itself and all of its blocked connections; add each resulting connection to the connection list; then add the connector as a blocked element to every connection affected, so that no pair is entered twice.

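import copy

conns = []
# work on a copy so the original rule set stays untouched
working_rules = copy.deepcopy(rules)
for item in names:
    # candidates: every name except the connector itself
    tmp_names = names.copy()
    tmp_names.remove(item)
    # drop everything the connector must not talk to
    allowed = [el for el in tmp_names if el not in working_rules[item]['exclude']]
    for el in allowed:
        conns.append([item, el])
        # block the connector on the other side so the pair is not listed twice
        working_rules[el]['exclude'].append(item)

print(conns)
# [['proxy', 's1'], ['proxy', 's2'], ['s1', 'queue'], ['s2', 'queue'], ['queue', 'w1']]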
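
As a cross-check, the same pairs can be sketched without mutating any rule set and without depending on processing order: make every exclusion mutual first, then keep each unordered pair that is not excluded. This is only a minimal sketch; the helper names symmetrize and allowed_pairs and the flat exclude dict (name mapped to a list of blocked names) are assumptions made for illustration, not part of the question's schema.

```python
from itertools import combinations

# Rules from the question, flattened to name -> blocked names (assumed layout).
exclude = {
    'proxy': ['queue', 'w1'],
    's1': ['s2', 'w1'],
    's2': ['s1', 'w1'],
    'queue': ['proxy'],
    'w1': ['proxy', 's1', 's2'],
}
names = ['proxy', 's1', 's2', 'queue', 'w1']


def symmetrize(exclude):
    """Return a new dict of sets in which every exclusion holds both ways."""
    mutual = {name: set(blocked) for name, blocked in exclude.items()}
    for name, blocked in exclude.items():
        for other in blocked:
            # if name excludes other, other must exclude name as well
            mutual.setdefault(other, set()).add(name)
    return mutual


def allowed_pairs(names, exclude):
    """List each unordered pair of names that is not excluded, exactly once."""
    mutual = symmetrize(exclude)
    return [[a, b] for a, b in combinations(names, 2)
            if b not in mutual.get(a, set())]


pairs = allowed_pairs(names, exclude)
print(pairs)
# [['proxy', 's1'], ['proxy', 's2'], ['s1', 'queue'], ['s2', 'queue'], ['queue', 'w1']]
```

Because combinations yields each unordered pair only once, no bookkeeping is needed to avoid duplicates such as ['s1', 'proxy'], and the user only has to state each exclusion in one direction.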

Given the expected output, the input rules are "incomplete", meaning mutuality of exclusions has to be inferred, as Lukas Thaler shows in his answer. The inferred exclusion rules will then depend on the order of the input names. A simplified version using sets might look as follows:

excl = {'proxy': ['queue', 'w1'],
        's1': ['s2', 'w1'],
        's2': ['s1', 'w1'],
        'queue': ['proxy'],
        'w1': ['proxy', 's1', 's2']}

names = ['proxy', 's1', 's2', 'queue', 'w1']

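result = []
for n in names:
    # sorted() keeps the output order reproducible; bare set iteration order is arbitrary
    for i in sorted(set(names) - set(excl[n] + [n])):
        result.append([n, i])
        # block n on the other side so the pair is not listed again
        excl[i].append(n)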

print(result)

# [['proxy', 's1'],
#  ['proxy', 's2'],
#  ['s1', 'queue'],
#  ['s2', 'queue'],
#  ['queue', 'w1']]

The updated exclusion rules would now be:

{'proxy': ['queue', 'w1'],
    's1': ['s2', 'w1', 'proxy'],
    's2': ['s1', 'w1', 'proxy'],
 'queue': ['proxy', 's1', 's2'],
    'w1': ['proxy', 's1', 's2', 'queue']}

EDIT #1

If I'm not mistaken, assuming bidirectional communication a priori would make things easier (and would also make for a clear definition of rules in the first place). Example: given objects a, b, c, one might say that a should communicate with b and vice versa. a should also communicate with c, but not vice versa. b should not communicate with c, and vice versa. That would make:

objs = ['a', 'b', 'c']
excl = {'a': [], 
        'b': ['c'],
        'c': ['a', 'b']}

# sorted() makes the order of the pairs reproducible; set iteration order is arbitrary
connections = [[o, i] for o in objs for i in sorted(set(objs) - set(excl[o] + [o]))]
print(connections)
# [['a', 'b'], ['a', 'c'], ['b', 'a']]

EDIT #2

Simplifying the connections list could then be done by sorting it into bi- and unidirectional components, e.g.

conn_grouped = {'bidirectional': [], 'unidirectional': []}
for c in connections:
    rev_c = list(reversed(c))
    if rev_c not in connections:
        # no reverse entry exists: one-way connection
        conn_grouped['unidirectional'].append(c)
    elif rev_c not in conn_grouped['bidirectional']:
        # reverse entry exists; record the pair only once
        conn_grouped['bidirectional'].append(c)

print(conn_grouped)
# {'bidirectional': [['a', 'b']], 'unidirectional': [['a', 'c']]}

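If the number of communicating nodes gets large, a more efficient algorithm might be required: each `in connections` test scans the whole list, so the loop above takes time quadratic in the number of connections.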
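
One possible way to speed up the grouping, sketched here under the assumption that the connections are hashable [source, target] pairs as in the example above, is to do the membership tests against a set of tuples instead of the list. The function name group_connections is made up for this illustration.

```python
def group_connections(connections):
    """Split directed [src, dst] pairs into bi- and unidirectional groups."""
    seen = {tuple(c) for c in connections}  # average O(1) membership tests
    grouped = {'bidirectional': [], 'unidirectional': []}
    reported = set()  # bidirectional pairs that were already recorded
    for src, dst in connections:
        if (dst, src) not in seen:
            # no reverse entry exists: one-way connection
            grouped['unidirectional'].append([src, dst])
        elif frozenset((src, dst)) not in reported:
            # reverse entry exists; record the pair only once
            reported.add(frozenset((src, dst)))
            grouped['bidirectional'].append([src, dst])
    return grouped


connections = [['a', 'b'], ['a', 'c'], ['b', 'a']]
print(group_connections(connections))
# {'bidirectional': [['a', 'b']], 'unidirectional': [['a', 'c']]}
```

With set lookups, the loop does a roughly constant amount of work per connection, so the grouping scales about linearly with the length of the connections list.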
