简体   繁体   中英

Delete certain elements of a numpy array

I have two numpy arrays a and b . I have a definition that construct an array c whose elements are all the possible sums of different elements of a.

import numpy as np

def Sumarray(a):
    n = len(a)

    sumarray = np.array([0]) # Add a default zero element
    for k in range(2,n+1):
        full = np.mgrid[k*(slice(n),)]
        nd_triu_idx = full[:,(np.diff(full,axis=0)>0).all(axis=0)]
        sumarray = np.append(sumarray, a[nd_triu_idx].sum(axis=0))

    return sumarray

a = np.array([1,2,6,8])
c = Sumarray(a)
print(d)

I then perform a subsetsum between an element of c and b : isSubsetSum returns the elements of b that when summed gives c[1]. Let's say that I get

c[0] = b[2] + b[3]

Then I want to remove:

  1. the elements b[2], b[3] (easy bit), and
  2. the elements of a that when summed gave c[0]

As you can see from the definition, Sumarray , the order of sums of different elements of a are preserved, so I need to realise some mapping.

The function isSubsetSum is given by

def _isSubsetSum(numbers, n, x, indices):
    if (x == 0):
        return True
    if (n == 0 and x != 0):
        return False
    # If last element is greater than x, then ignore it
    if (numbers[n - 1] > x):
        return _isSubsetSum(numbers, n - 1, x, indices)
    # else, check if x can be obtained by any of the following
    found = _isSubsetSum(numbers, n - 1, x, indices)
    if found: return True
    indices.insert(0, n - 1)
    found = _isSubsetSum(numbers, n - 1, x - numbers[n - 1], indices)
    if not found: indices.pop(0)
    return found

def isSubsetSum(numbers, x):
    indices = []
    found = _isSubsetSum(numbers, len(numbers), x, indices)
    return indices if found else None

As you are iterating over all possible numbers of terms, you could as well directly generate all possible subsets.

These can be conveniently encoded as numbers 0,1,2,... by means of their binary representations: O means no terms at all, 1 means only the first term, 2 means only the second, 3 means the first and the second and so on.

Using this scheme it becomes very easy to recover the terms from the sum index because all we need to do is obtain the binary representation:

UPDATE: we can suppress 1-term-sums with a small amount of extra code:

import numpy as np

def find_all_subsums(a,drop_singletons=False):
    n = len(a)
    assert n<=32 # this gives 4G subsets, and we have to cut somewhere
    # compute the smallest integer type with enough bits
    dt = f"<u{1<<((n-1)>>3).bit_length()}"
    # the numbers 0 to 2^n encode all possible subsets of an n
    # element set by means of their binary representation
    # each bit corresponds to one element number k represents the 
    # subset consisting of all elements whose bit is set in k
    rng = np.arange(1<<n,dtype=dt)
    if drop_singletons:
        # one element subsets correspond to powers of two
        rng = np.delete(rng,1<<np.arange(n))
    # np.unpackbits transforms bytes to their binary representation
    # given the a bitvector b we can compute the corresponding subsum
    # as b dot a, to do it in bulk we can mutliply the matrix of 
    # binary rows with a
    return np.unpackbits(rng[...,None].view('u1'),
                         axis=1,count=n,bitorder='little') @ a

def show_terms(a,idx,drop_singletons=False):
    n = len(a)
    if drop_singletons:
        # we must undo the dropping of powers of two to get an index
        # that is easy to translate. One can check that the following
        # formula does the trick
        idx += (idx+idx.bit_length()).bit_length()
        # now we can simply use the binary representation
    return a[np.unpackbits(np.asarray(idx,dtype='<u8')[None].view('u1'),
                           count=n,bitorder='little').view('?')]


example = np.logspace(1,7,7,base=3)
ss = find_all_subsums(example,True)
# check every single sum
for i,s in enumerate(ss):
    assert show_terms(example,i,True).sum() == s
# print one example
idx = 77
print(ss[idx],"="," + ".join(show_terms(example.astype('U'),idx,True)))

Sample run:

2457.0 = 27.0 + 243.0 + 2187.0

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM