Python - Best way to find the 1d center of mass in a binary numpy array

Question

Suppose I have the following NumPy array, which contains one and only one contiguous slice of 1s:

import numpy as np
x = np.array([0,0,0,0,1,1,1,0,0,0], dtype=int)

and I want to find the index of the 1D center of mass of the 1 elements. I could type the following:

idx = np.where( x )[0]
idx_center_of_mass = int(0.5*(idx.max() + idx.min()))
# this would give 5

(Of course, this gives only a rough approximation when the slice of 1s has an even number of elements.) Is there a better way to do this, such as a computationally more efficient one-liner?

Answer 1

Can't you simply do the following?

center_of_mass = (x*np.arange(len(x))).sum()/x.sum() # 5

%timeit center_of_mass = (x*np.arange(len(x))).sum()/x.sum()
# 100000 loops, best of 3: 10.4 µs per loop
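The expression above is a weighted mean of the index positions, with the array values as weights. As a minimal sketch, assuming Python 3 (so `/` is true division), the same quantity can also be obtained from `np.average` with its `weights` argument; the variable names `positions`, `com_manual` and `com_average` are only illustrative:

```python
import numpy as np

x = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 0], dtype=int)
positions = np.arange(len(x))

# Weighted mean written out by hand, as in the answer above.
com_manual = (x * positions).sum() / x.sum()

# The same weighted mean, computed by np.average with x as the weights.
com_average = np.average(positions, weights=x)

print(com_manual, com_average)  # 5.0 5.0
```

Both forms also work for non-binary weights, which the index-based approaches do not.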

Answer 2

As one approach, we can get the non-zero indices and take their mean as the center of mass, like so -

np.flatnonzero(x).mean()
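To illustrate the even-length case raised in the question, here is a small sketch with a hypothetical array `x_even` holding a slice of four 1s. The mean of the non-zero indices keeps the half-integer center, whereas the truncating formula from the question rounds it down:

```python
import numpy as np

# Hypothetical example: the slice of 1s covers indices 4..7 (even length).
x_even = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0, 0], dtype=int)

idx = np.flatnonzero(x_even)   # array([4, 5, 6, 7])
com_mean = idx.mean()          # 5.5, the exact center

# Formula from the question: int() truncates 5.5 down to 5.
com_truncated = int(0.5 * (idx.max() + idx.min()))

print(com_mean, com_truncated)  # 5.5 5
```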

Here's another approach: compare the array with a copy of itself shifted by one element to find the start and stop indices of the slice, then take the mean of those indices as the center of mass, like so -

np.flatnonzero(x[:-1] != x[1:]).mean()+0.5
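The shifted comparison can be unpacked step by step on the small array from the question. This is only a sketch of the intermediate values; the names `changes` and `edges` are illustrative. The first transition is found one position before the slice starts and the second at its last element, which is why 0.5 is added to the mean:

```python
import numpy as np

x = np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 0], dtype=int)

# True wherever an element differs from its right-hand neighbour.
changes = x[:-1] != x[1:]

# Indices of the two transitions: 3 (0 -> 1) and 6 (1 -> 0).
edges = np.flatnonzero(changes)

# edges holds (first_one - 1) and last_one, so their mean is center - 0.5.
center = edges.mean() + 0.5

print(edges, center)  # [3 6] 5.0
```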

Runtime test -

In [72]: x = np.zeros(10000,dtype=int)

In [73]: x[100:2000] = 1

In [74]: %timeit np.flatnonzero(x).mean()
10000 loops, best of 3: 115 µs per loop

In [75]: %timeit np.flatnonzero(x[:-1] != x[1:]).mean()+0.5
10000 loops, best of 3: 38.7 µs per loop

We can improve the performance by some margin here with the use of np.nonzero()[0] to replace np.flatnonzero and np.sum in place of np.mean -

In [107]: %timeit (np.nonzero(x[:-1] != x[1:])[0].sum()+1)/2.0
10000 loops, best of 3: 30.6 µs per loop

Alternatively, for the second approach, we can store the start and stop indices and simply add them to get the center of mass. This is a bit more efficient because it avoids the function call to np.mean, like so -

start,stop = np.flatnonzero(x[:-1] != x[1:])
out = (stop + start + 1)/2.0

Timings -

In [90]: %timeit start,stop = np.flatnonzero(x[:-1] != x[1:])
10000 loops, best of 3: 21.3 µs per loop

In [91]: %timeit (stop + start + 1)/2.0
100000 loops, best of 3: 4.45 µs per loop

Again, we can experiment with np.nonzero()[0] here.
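One caveat with the shifted-comparison approach: if the slice of 1s touches either end of the array, only one transition is found, so unpacking into start and stop raises an error and the mean-based one-liner returns a wrong value. A possible workaround, sketched here as an assumption and not part of the original answer, is to pad the array with a zero on each side before looking for transitions. The helper name `center_of_run` is hypothetical:

```python
import numpy as np

def center_of_run(x):
    """Center index of the single contiguous run of non-zero values in x."""
    # Convert to a small signed 0/1 array so np.diff behaves for any dtype.
    mask = (np.asarray(x) != 0).astype(np.int8)
    # Pad with zeros so a run touching either edge still yields two transitions.
    padded = np.concatenate(([0], mask, [0]))
    edges = np.flatnonzero(np.diff(padded))
    if edges.size != 2:
        raise ValueError("expected exactly one contiguous run of non-zero values")
    # In the original coordinates: first index of the run, and one past its end.
    start, stop_exclusive = edges
    return (start + stop_exclusive - 1) / 2.0

print(center_of_run(np.array([0, 0, 0, 0, 1, 1, 1, 0, 0, 0])))  # 5.0
print(center_of_run(np.array([1, 1, 0, 0])))                    # 0.5
```

The padding costs an extra array copy, so it trades a little of the speed measured above for robustness at the edges.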

2 answers

solution1
3 2016-09-27 08:18:21

solution2
2 ACCEPTED 2016-09-27 07:58:24
