How to generate all possible pairs of coordinates without repetition in numpy efficiently

Question

I am trying to generate all pairs of coordinates for the pixels of a color image, without repeating pairs. Order doesn't matter, so ((1,1,1), (2,2,2)) is the same as ((2,2,2), (1,1,1)) and should be included only once. It is also important to me that the coordinates end up in a numpy array.

Let's assume I have a 10x10 image. It has 100 pixels with 3 color channels, which gives 300 coordinates and therefore 300*299/2 unique pairs of coordinates. Using either itertools.combinations() or plain Python iteration and then converting to np.array is painfully slow for bigger images (on my PC it takes 5 s for a 32x32x3 image).
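As a quick sanity check of the count above, the number of unordered pairs can be computed with the standard library. This is only an illustrative sketch; the variable names are made up for the example, and math.comb needs Python 3.8 or newer.

```python
import math

height, width, channels = 10, 10, 3       # the 10x10 color image from the example
n_coords = height * width * channels       # one coordinate per pixel and channel
n_pairs = math.comb(n_coords, 2)           # unordered pairs without repetition

print(n_coords, n_pairs)                   # 300 44850
```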

I am able to create a list of all pixels using

all_pixels = np.array(np.meshgrid(range(10), range(10), range(3))).T.reshape(-1, 3)

but that works only because there are no repetitions to consider. Doing the same to create pairs of pixels gives me duplicates. I guess I could remove the duplicates in some smart way, but I have no idea how to do that efficiently.

Any help will be greatly appreciated.

It's a bit crude, but for reference this is how I do it now:

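    import time
    import numpy as np

    # The enclosing function header was not shown; this name is a placeholder.
    def generate_pixel_pairs(shape):
        start = time.time()
        x, y, z = shape
        # Build the list of all coordinates (2-D if there is a single channel).
        all_pixels = []
        for i in range(x):
            for j in range(y):
                if z > 1:
                    for k in range(z):
                        all_pixels.append([i, j, k])
                else:
                    all_pixels.append([i, j])
        # Pair every coordinate with each coordinate that comes after it.
        first_pix = []
        second_pix = []
        for i in range(len(all_pixels)):
            first = all_pixels[i]
            for second in all_pixels[i+1:]:
                first_pix.append(first)
                second_pix.append(second)
        print("generation of pixels took " + str(time.time() - start))
        return np.array(first_pix), np.array(second_pix)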

Answer 1

Here is a straightforward numpy method; I'm not sure how fast it is:

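import numpy as np

shape = 10,10,3
# triu_indices gives all flat index pairs (a, b) with a < b; unravel_index maps each flat index back to (i, j, c)
np.stack([*map(np.transpose, map(np.unravel_index, np.triu_indices(np.prod(shape),1), 2*(shape,)))],-2)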

Output:

array([[[0, 0, 0],
        [0, 0, 1]],

       [[0, 0, 0],
        [0, 0, 2]],

       [[0, 0, 0],
        [0, 1, 0]],

       ...,

       [[9, 9, 0],
        [9, 9, 1]],

       [[9, 9, 0],
        [9, 9, 2]],

       [[9, 9, 1],
        [9, 9, 2]]])

Update: same idea, same result, but faster:

np.column_stack(np.unravel_index(np.arange(np.prod(shape)),shape))[np.column_stack(np.triu_indices(np.prod(shape),1))]
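For readers who find the one-liner hard to follow, here is the same computation unpacked into steps. It is a sketch of my reading of the expression above; the intermediate names (coords, first, second, pairs) are introduced only for illustration, and the itertools comparison is there only to confirm that the ordering matches.

```python
import itertools
import numpy as np

shape = (10, 10, 3)
n = int(np.prod(shape))                        # 300 flat positions

# Coordinate table: row k holds the (i, j, c) index of flat position k.
coords = np.column_stack(np.unravel_index(np.arange(n), shape))    # shape (300, 3)

# All flat index pairs (a, b) with a < b: the strict upper triangle of an n x n grid.
first, second = np.triu_indices(n, k=1)        # each has n*(n-1)/2 entries

# Indexing the table with an (n_pairs, 2) index array yields (n_pairs, 2, 3).
pairs = coords[np.column_stack((first, second))]

# Cross-check against itertools.combinations (slow, for verification only).
expected = np.array(list(itertools.combinations(coords.tolist(), 2)))
print(pairs.shape, np.array_equal(pairs, expected))
```

If two separate arrays are wanted, as in the question's loop version, pairs[:, 0] and pairs[:, 1] give the first and second pixel of each pair. Keep in mind that the result grows quadratically: a 32x32x3 image already produces roughly 4.7 million pairs.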

Answer 2

The example below is cooler but slower than OP's solution :(

%%timeit
first_pix = []
second_pix = []
for i in range(len(pixels)):
    first = pixels[i]
    for j in pixels[i+1:]:
        second = j
        first_pix.append(first)
        second_pix.append(second)

3.57 ms ± 59.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) 

%%timeit
mixed = {frozenset((i, j)) for i in pixels for j in pixels if i != j}

36.5 ms ± 1.33 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Code

First create a list of all pixels. We use tuples so that they are hashable; we will see why later.

img_width = 10
img_height = 10
img_colors = 3

pixels = [(x, y, c) for x in range(img_width) for y in range(img_height) for c in range(img_colors)]

Now we use sets to make sure we have no duplicates.

mixed = {frozenset((i, j)) for i in pixels for j in pixels if i != j}

Now we check that it has the correct number of values:

>>> desired_length = len(pixels) * (len(pixels) - 1) // 2
>>> len(mixed) == desired_length
True

Explanation

We use a nested set comprehension to create all ordered pairs. It has the following format:

{(x, y) for x in xs for y in ys}

Because this is a set, all items in it are unique. This requires everything in the set to be hashable, i.e. Python must be able to hash the items and compare them for equality.

We do not only want the pixel combinations to be unique, we also want them to be unique regardless of order. Therefore we use a set again for each pair, but since a normal set is not hashable, we use the built-in type frozenset. Each pair is now a frozenset of tuples, which is both hashable and order independent.

>>> frozenset([(0, 0, 1), (0, 0, 2)]) == frozenset([(0, 0, 2), (0, 0, 1)])
True

We have to add i != j to make sure we do not put the same coordinates into a frozenset twice, which would give a set of length 1 (for example, set([(0, 0, 1), (0, 0, 1)]) equals {(0, 0, 1)}):

>>> frozenset([(0, 0, 1), (0, 0, 1)])
frozenset({(0, 0, 1)})
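To make the hashability point concrete, here is a small sketch comparing the three container choices for a pair. The names a, b and the result variables are just for illustration.

```python
a, b = (0, 0, 1), (0, 0, 2)

# Tuples are hashable but ordered, so (a, b) and (b, a) count as two items.
tuple_pairs = {(a, b), (b, a)}

# A plain set is mutable and unhashable, so it cannot be stored inside a set.
try:
    {{a, b}}
    plain_set_ok = True
except TypeError:
    plain_set_ok = False

# A frozenset is hashable and ignores order, so both spellings collapse to one.
frozen_pairs = {frozenset((a, b)), frozenset((b, a))}

print(len(tuple_pairs), plain_set_ok, len(frozen_pairs))   # 2 False 1
```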

Full code

%%timeit

img_width = 10
img_height = 10
img_colors = 3

pixels = [(x, y, c) for x in range(img_width) for y in range(img_height) for c in range(img_colors)]
mixed = {frozenset((i, j)) for i in pixels for j in pixels if i != j}

38.2 ms ± 2.46 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
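Since the question asks for a numpy array, the set of frozensets still has to be converted. One possible way, sketched here on a smaller hypothetical 4x4x3 image so it runs quickly, is to sort each pair and then the list of pairs, because sets carry no order of their own. The name pair_array is an assumption made for this example.

```python
import numpy as np

# Smaller hypothetical image so the example runs quickly.
img_width, img_height, img_colors = 4, 4, 3

pixels = [(x, y, c) for x in range(img_width)
                    for y in range(img_height)
                    for c in range(img_colors)]
mixed = {frozenset((i, j)) for i in pixels for j in pixels if i != j}

# Sets are unordered: sort inside each pair, then sort the pairs,
# to get a reproducible layout before building the array.
pair_array = np.array(sorted(sorted(pair) for pair in mixed))

print(pair_array.shape)   # (1128, 2, 3)
```

The extra sorting adds to the cost of an approach that is already slower than the loop, so this is mainly useful for checking results on small inputs.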
