My program generates lists like this:
mydata = ["foo", "bar", "baz", "quux", "quid", "quo"]
And I know from other data that these can be grouped in couples (here a list of tuples, but can be changed to whatever):
static_mapping = [("foo", "quo"), ("baz", "quux"), ("quid", "bar")]
There's no ordering in the couples.
Now on to the problem: my program generates mydata, and I need to group the data by couple while keeping a separate list of non-matched items. The reason is that at any moment mydata may not contain all items that are part of the couples.
Expected results from such a hypothetical function:
mydata = ["foo", "bar", "quo", "baz"]
couples, remainder = group_and_split(mydata, static_mapping)
print(couples)
[("foo", "quo")]
print(remainder)
["bar", "baz"]
EDIT: Examples of what I've tried (but they stop at finding the coupling):
found_pairs = list()
for coupling in static_mapping:
    # The set method is called intersection, not intersect
    pairs = set(mydata).intersection(set(coupling))
    if not pairs or len(pairs) != 2:
        continue
    found_pairs.append(pairs)
I got stuck at finding a reliable way to get the remainder out.
You may try this:
import copy
def group_and_split(mydata, static_mapping):
    # Work on a copy so the caller's list is left untouched
    remainder = copy.deepcopy(mydata)
    couples = []
    for couple in static_mapping:
        # A couple counts only if both of its members are present
        if couple[0] in mydata and couple[1] in mydata:
            remainder.remove(couple[0])
            remainder.remove(couple[1])
            couples.append(couple)
    return [couples, remainder]
Using a set speeds up the membership tests when the list is large, at the cost of some extra memory, and deepcopy keeps the original data intact.
One possible implementation of the hypothetical function:
from copy import deepcopy
def group_and_split(mydata, static_mapping):
    temp = set(mydata)  # set for fast membership tests
    couples = []
    remainder = deepcopy(mydata)
    for value1, value2 in static_mapping:
        if value1 in temp and value2 in temp:
            couples.append((value1, value2))
            remainder.remove(value1)
            remainder.remove(value2)
    return couples, remainder
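Both versions above call remainder.remove for every matched item, which scans the list each time. As a minimal sketch of an alternative (not taken from the answers above), the remainder can be built in one pass by filtering mydata against the set of matched items. It assumes the items are hashable; note that, unlike remove, it drops every copy of a matched item if mydata contains duplicates.

```python
def group_and_split(mydata, static_mapping):
    """Return (couples, remainder) without modifying mydata."""
    present = set(mydata)  # fast membership tests
    # Keep only the couples whose two members are both present.
    couples = [pair for pair in static_mapping
               if pair[0] in present and pair[1] in present]
    # Flatten the matched couples so matched items can be filtered out.
    matched = {item for pair in couples for item in pair}
    # Unmatched items, in their original order from mydata.
    remainder = [item for item in mydata if item not in matched]
    return couples, remainder


static_mapping = [("foo", "quo"), ("baz", "quux"), ("quid", "bar")]
couples, remainder = group_and_split(["foo", "bar", "quo", "baz"], static_mapping)
print(couples)    # [('foo', 'quo')]
print(remainder)  # ['bar', 'baz']
```

Because nothing is removed in place, no copy of mydata is needed at all.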
Here is how to do it with sets, using the symmetric_difference function:
>>> full = ["foo", "quo", "baz", "quux", "quid", "bar", 'newone', 'anotherone']
>>> couples = [("foo", "quo"), ("baz", "quux"), ("quid", "bar")]
# Now for the remainder. Note that you have to flatten your couples:
>>> set(full).symmetric_difference([item for sublist in couples for item in sublist])
set(['anotherone', 'newone'])
>>>
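One caveat: symmetric_difference returns the items that are in exactly one of the two sets, so when mydata is missing one half of a couple, the absent partner ends up in the result as well. A minimal sketch, assuming the sample data from the question, of using a plain set difference against only the complete couples instead (sets do not preserve order, so the results are sorted for display):

```python
mydata = ["foo", "bar", "quo", "baz"]
static_mapping = [("foo", "quo"), ("baz", "quux"), ("quid", "bar")]

present = set(mydata)
flat = {item for pair in static_mapping for item in pair}

# symmetric_difference also reports mapping items missing from mydata.
sym = present.symmetric_difference(flat)

# Keep only couples whose members are both present, then subtract them.
complete = [pair for pair in static_mapping if present.issuperset(pair)]
remainder = present - {item for pair in complete for item in pair}

print(sorted(sym))        # ['quid', 'quux']
print(sorted(remainder))  # ['bar', 'baz']
```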