
Delete certain elements of a numpy array

I have two numpy arrays a and b. I have a function that constructs an array c whose elements are all the possible sums of different elements of a.

import numpy as np

def Sumarray(a):
    n = len(a)

    sumarray = np.array([0]) # Add a default zero element
    for k in range(2,n+1):
        full = np.mgrid[k*(slice(n),)]
        nd_triu_idx = full[:,(np.diff(full,axis=0)>0).all(axis=0)]
        sumarray = np.append(sumarray, a[nd_triu_idx].sum(axis=0))

    return sumarray

a = np.array([1,2,6,8])
c = Sumarray(a)
print(c)

I then run a subset-sum search between an element of c and b: isSubsetSum returns the indices of the elements of b that sum to c[1]. Let's say that I get

c[0] = b[2] + b[3]

Then I want to remove:

  1. the elements b[2], b[3] (the easy bit), and
  2. the elements of a that, when summed, gave c[0]
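
For the first item, np.delete accepts a list of indices and returns a new array without those positions. A minimal sketch, using made-up values for b (the question does not give any) and assuming the subset-sum step returned the indices [2, 3]:

```python
import numpy as np

# Hypothetical values; the question does not specify b.
b = np.array([4, 7, 1, 2, 9])
# Assumed result of the subset-sum step: indices into b.
indices = [2, 3]

# np.delete returns a new array; b itself is left unchanged.
b_remaining = np.delete(b, indices)
print(b_remaining)  # [4 7 9]
```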

As you can see from the definition of Sumarray, the order of the sums of different elements of a is preserved, so I need to work out a mapping from an index of c back to the elements of a.
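
One possible mapping, as a sketch only: for each k, Sumarray appears to emit its index tuples in the same lexicographic order as itertools.combinations(range(n), k), so a parallel table of index tuples can be built alongside c. The helper name index_table is made up for illustration and is not part of the original code.

```python
import itertools
import numpy as np

def index_table(n):
    """Index tuples in the order Sumarray emits its sums.

    Assumption: for each k the tuples come out in lexicographic order,
    which is also the order produced by itertools.combinations.
    """
    table = [()]  # position 0 matches the default zero element
    for k in range(2, n + 1):
        table.extend(itertools.combinations(range(n), k))
    return table

a = np.array([1, 2, 6, 8])
table = index_table(len(a))

# Rebuild the sums from the table as a consistency check.
sums = np.array([sum(int(a[i]) for i in t) for t in table])
print(sums)      # [ 0  3  7  9  8 10 14  9 11 15 16 17]
print(table[1])  # (0, 1): the sum at position 1 is a[0] + a[1]

# Remove the elements of a that produced the sum at position 1.
a_remaining = np.delete(a, list(table[1]))
print(a_remaining)  # [6 8]
```

The table grows as 2^n, just like c itself, so this is only practical for small arrays.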

The function isSubsetSum is given by

def _isSubsetSum(numbers, n, x, indices):
    if (x == 0):
        return True
    if (n == 0 and x != 0):
        return False
    # If last element is greater than x, then ignore it
    if (numbers[n - 1] > x):
        return _isSubsetSum(numbers, n - 1, x, indices)
    # else, check if x can be obtained by any of the following
    found = _isSubsetSum(numbers, n - 1, x, indices)
    if found: return True
    indices.insert(0, n - 1)
    found = _isSubsetSum(numbers, n - 1, x - numbers[n - 1], indices)
    if not found: indices.pop(0)
    return found

def isSubsetSum(numbers, x):
    indices = []
    found = _isSubsetSum(numbers, len(numbers), x, indices)
    return indices if found else None

Since you are iterating over all possible numbers of terms anyway, you could just as well generate all possible subsets directly.

These can be conveniently encoded as the numbers 0, 1, 2, ... by means of their binary representations: 0 means no terms at all, 1 means only the first term, 2 means only the second, 3 means the first and the second, and so on.
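
To make the encoding concrete, here is a small sketch that decodes each code into the 0-based positions of its set bits. The function name terms_of is hypothetical.

```python
def terms_of(code, n):
    """0-based positions of the terms in the subset encoded by `code`."""
    return [i for i in range(n) if (code >> i) & 1]

n = 3
for code in range(1 << n):
    # Show the code, its n-bit binary form, and the decoded positions.
    print(code, format(code, f"0{n}b"), terms_of(code, n))
```

For n = 3 this lists all eight subsets, from the empty one (code 0) to the full set (code 7).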

Using this scheme it becomes very easy to recover the terms from the sum index, because all we need to do is obtain the binary representation:

UPDATE: we can suppress one-term sums with a small amount of extra code:

import numpy as np

def find_all_subsums(a,drop_singletons=False):
    n = len(a)
    assert n<=32 # this gives 4G subsets, and we have to cut somewhere
    # compute the smallest integer type with enough bits
    dt = f"<u{1<<((n-1)>>3).bit_length()}"
    # the numbers 0 to 2^n - 1 encode all possible subsets of an n
    # element set by means of their binary representation:
    # each bit corresponds to one element, and the number k represents
    # the subset consisting of all elements whose bit is set in k
    rng = np.arange(1<<n,dtype=dt)
    if drop_singletons:
        # one element subsets correspond to powers of two
        rng = np.delete(rng,1<<np.arange(n))
    # np.unpackbits transforms bytes to their binary representation;
    # given a bit vector b we can compute the corresponding subsum
    # as b dot a; to do it in bulk we can multiply the matrix of
    # binary rows with a
    return np.unpackbits(rng[...,None].view('u1'),
                         axis=1,count=n,bitorder='little') @ a

def show_terms(a,idx,drop_singletons=False):
    n = len(a)
    if drop_singletons:
        # we must undo the dropping of powers of two to get an index
        # that is easy to translate. One can check that the following
        # formula does the trick
        idx += (idx+idx.bit_length()).bit_length()
        # now we can simply use the binary representation
    return a[np.unpackbits(np.asarray(idx,dtype='<u8')[None].view('u1'),
                           count=n,bitorder='little').view('?')]


example = np.logspace(1,7,7,base=3)
ss = find_all_subsums(example,True)
# check every single sum
for i,s in enumerate(ss):
    assert show_terms(example,i,True).sum() == s
# print one example
idx = 77
print(ss[idx],"="," + ".join(show_terms(example.astype('U'),idx,True)))

Sample run:

2457.0 = 27.0 + 243.0 + 2187.0
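
To connect this back to the question, once the subset code for a sum is known, the same bits can serve as a mask for removing those terms from a. The following is a sketch under the assumption that no singletons were dropped, so the index into the array of sums is itself the subset code; remove_terms is a hypothetical helper, not part of the code above.

```python
import numpy as np

def remove_terms(a, code):
    """Drop the elements of `a` whose bit is set in the integer `code`."""
    # Bit i of `code` says whether a[i] took part in the sum.
    mask = ((code >> np.arange(len(a))) & 1).astype(bool)
    return a[~mask]

a = np.array([1, 2, 6, 8])
code = 0b0101  # encodes a[0] + a[2] = 7
print(remove_terms(a, code))  # [2 8]
```

If singletons were dropped, the index would first need the same adjustment that show_terms applies before it can be used as a code.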
