Delete certain elements of a numpy array
I have two numpy arrays, a and b. I have a function that constructs an array c whose elements are all the possible sums of different elements of a.
import numpy as np

def Sumarray(a):
    n = len(a)
    sumarray = np.array([0])  # Add a default zero element
    for k in range(2,n+1):
        # all index tuples of length k
        full = np.mgrid[k*(slice(n),)]
        # keep only strictly increasing index tuples (distinct elements)
        nd_triu_idx = full[:,(np.diff(full,axis=0)>0).all(axis=0)]
        sumarray = np.append(sumarray, a[nd_triu_idx].sum(axis=0))
    return sumarray

a = np.array([1,2,6,8])
c = Sumarray(a)
print(c)
I then perform a subset sum between an element of c and b: isSubsetSum returns the indices of the elements of b that, when summed, give c[1]. Let's say that I get
c[0] = b[2] + b[3]
Then I want to remove the elements of a that, when summed, gave c[0]. As you can see from the definition of Sumarray, the order of the sums of different elements of a is preserved, so I need to realise some mapping.
The function isSubsetSum is given by
def _isSubsetSum(numbers, n, x, indices):
    if (x == 0):
        return True
    if (n == 0 and x != 0):
        return False
    # If last element is greater than x, then ignore it
    if (numbers[n - 1] > x):
        return _isSubsetSum(numbers, n - 1, x, indices)
    # else, check if x can be obtained by any of the following
    found = _isSubsetSum(numbers, n - 1, x, indices)
    if found: return True
    indices.insert(0, n - 1)
    found = _isSubsetSum(numbers, n - 1, x - numbers[n - 1], indices)
    if not found: indices.pop(0)
    return found

def isSubsetSum(numbers, x):
    indices = []
    found = _isSubsetSum(numbers, len(numbers), x, indices)
    return indices if found else None
As you are iterating over all possible numbers of terms, you could as well directly generate all possible subsets.
These can be conveniently encoded as numbers 0,1,2,... by means of their binary representations: 0 means no terms at all, 1 means only the first term, 2 means only the second, 3 means the first and the second and so on.
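To make the encoding concrete, here is a minimal sketch that decodes a subset code back into its terms. The helper name subset_from_code and the example values are assumptions made for illustration; they are not part of the code in this thread.

```python
import numpy as np

a = np.array([1, 2, 6, 8])  # example values, chosen for illustration

def subset_from_code(a, k):
    """Return the elements of a whose bit is set in the integer k."""
    # bit j of k (for j = 0 .. n-1) says whether a[j] belongs to the subset
    mask = (k >> np.arange(len(a))) & 1
    return a[mask.astype(bool)]

# show the first few codes, their binary form and the subset they stand for
for k in range(6):
    print(k, format(k, "04b"), subset_from_code(a, k))
```

Reading the binary string from right to left gives the membership of the first, second, third, ... element.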
Using this scheme it becomes very easy to recover the terms from the sum index because all we need to do is obtain the binary representation:
UPDATE: we can suppress 1-term-sums with a small amount of extra code:
import numpy as np

def find_all_subsums(a,drop_singletons=False):
    n = len(a)
    assert n<=32 # this gives 4G subsets, and we have to cut somewhere
    # compute the smallest integer type with enough bits
    dt = f"<u{1<<((n-1)>>3).bit_length()}"
    # the numbers 0 to 2^n - 1 encode all possible subsets of an n
    # element set by means of their binary representation:
    # each bit corresponds to one element; number k represents the
    # subset consisting of all elements whose bit is set in k
    rng = np.arange(1<<n,dtype=dt)
    if drop_singletons:
        # one element subsets correspond to powers of two
        rng = np.delete(rng,1<<np.arange(n))
    # np.unpackbits transforms bytes to their binary representation;
    # given a bitvector b we can compute the corresponding subsum
    # as b dot a; to do it in bulk we can multiply the matrix of
    # binary rows with a
    return np.unpackbits(rng[...,None].view('u1'),
                         axis=1,count=n,bitorder='little') @ a

def show_terms(a,idx,drop_singletons=False):
    n = len(a)
    if drop_singletons:
        # we must undo the dropping of powers of two to get an index
        # that is easy to translate. One can check that the following
        # formula does the trick
        idx += (idx+idx.bit_length()).bit_length()
    # now we can simply use the binary representation
    return a[np.unpackbits(np.asarray(idx,dtype='<u8')[None].view('u1'),
                           count=n,bitorder='little').view('?')]

example = np.logspace(1,7,7,base=3)
ss = find_all_subsums(example,True)
# check every single sum
for i,s in enumerate(ss):
    assert show_terms(example,i,True).sum() == s
# print one example
idx = 77
print(ss[idx],"="," + ".join(show_terms(example.astype('U'),idx,True)))
Sample run:
2457.0 = 27.0 + 243.0 + 2187.0
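Since the original goal was to delete the terms from a once a sum has been matched, the same bit scheme can also give the positions to drop. The following is only a sketch under stated assumptions: it keeps all subsets (no singleton dropping), the helper term_positions is a hypothetical name, and the matched index 5 is picked arbitrarily for illustration.

```python
import numpy as np

def term_positions(n, code):
    """Positions j (0 <= j < n) whose bit is set in the subset code."""
    return np.flatnonzero((code >> np.arange(n)) & 1)

a = np.array([1, 2, 6, 8])  # example values, chosen for illustration
n = len(a)

# all subset sums, indexed by subset code 0 .. 2**n - 1
bits = (np.arange(1 << n)[:, None] >> np.arange(n)) & 1
sums = bits @ a

code = 5                        # assumed: index of the sum that was matched
pos = term_positions(n, code)   # positions in a of the terms of that sum
remaining = np.delete(a, pos)   # a without those terms

print(sums[code], "=", " + ".join(map(str, a[pos])), "; remaining:", remaining)
```

np.delete returns a new array and leaves a unchanged, so the result has to be assigned back if a itself should shrink.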