I know that itertools has a method for generating combinations, as described here: Get unique combinations of elements from a python list . However, I'm looking for an iterator that gives unique combinations and their multiplicities.
Example: I have an expression that only depends on which combination of 2 elements I select from a list L = [2,1,2,2]. I need to sum the result for all combinations. What I want is an iterator that gives eg (([1,2], 3), ([2,2], 3)). That way, I can compute the expression for just the 2 unique combinations and multiply by 3, rather than computing for all 6 combinations, of which many give the same result.
You can combine itertools.combinations
with collections.Counter
.
import itertools
import collections
L = [2,1,2,2]
c = collections.Counter()
c.update(map(tuple, map(sorted, itertools.combinations(L, 2))))
c.items()
then gives:
>>> c.items()
[((1, 2), 3), ((2, 2), 3)]
To break it down, itertools.combinations(L, 2)
gives all the ordered combinations of L
of length 2. We then use sorted
to make them comparable since collections.Counter
will use hashing and equality to count. Finally, because list
objects are not hashable, we convert them to tuple
objects which are.
In the end, my code took too long to explicitly count every possible combination, so I came up with a way to find only the unique ones and then analytically compute their multiplicities. It's based on the following idea: Call the input list A and the number of elements in each subset k. First sort the list and initialize k pointers to the first k elements of A. Then repeatedly attempt to move the rightmost pointer to the right until it encounters a new value. Every time another pointer than the rightmost is moved, all pointers to its right are set to its neighbors, eg if pointer 1 is moved to index 6, pointer 2 is moved to index 7 and so on.
The multiplicity of any combination C can be found by multiplying the binomial coefficients (N_i, m_i) where N_i and m_i are the number of occurrences of element i in A and C, respectively.
Below is an implementation of a brute force approach, and a method which exploits uniqueness.
This figure compares the runtime of brute force counting vs. my approach. Counting becomes infeasible when the input list has about 20 elements.
# -*- coding: utf-8 -*-
from __future__ import division
from itertools import combinations
from collections import Counter
from operator import mul
import numpy as np
from scipy.special import binom
def brute(A, k):
'''This works, but counts every combination.'''
A_sorted = sorted(A)
d = {}
for comb in combinations(A_sorted, k):
try:
d[comb] += 1
except KeyError:
d[comb] = 1
#
return d
def get_unique_unordered_combinations(A, k):
'''Returns all unique unordered subsets with size k of input array.'''
# If we're picking zero elements, we can only do it in one way. Duh.
if k < 0:
raise ValueError("k must be non-negative")
if k == 0 or k > len(A):
yield ()
return # Done. There's only one way to select zero elements :)
# Sorted version of input list
A = np.array(sorted(A))
# Indices of currently selected combination
inds = range(k)
# Pointer to the index we're currently trying to increment
lastptr = len(inds) - 1
# Construct list of indices of next element of A different from current.
# e.g. [1,1,1,2,2,7] -> [3,3,3,5,5,6] (6 falls off list)
skipper = [len(A) for a in A]
prevind = 0
for i in xrange(1, len(A)):
if A[i] != A[prevind]:
for j in xrange(prevind, i):
skipper[j] = i
prevind = i
#
while True:
# Yield current combination from current indices
comb = tuple(A[inds])
yield comb
# Try attempt to change indices, starting with rightmost index
for p in xrange(lastptr, -1 , -1):
nextind = skipper[inds[p]]
#print "Trying to increment index %d to %d" % (inds[p], nextind)
if nextind + (lastptr - p) >= len(A):
continue # No room to move this pointer. Try the next
#print "great success"
for i in xrange(lastptr-p+1):
inds[p+i] = nextind + i
break
else:
# We've exhausted all possibilities, so there are no combs left
return
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.