
Find unique pairs of values in Tensorflow

In Python 3.x with TensorFlow, if I have two TF vectors, point_x and point_y (same shape), that represent the X and Y coordinates of some number of points, how do I find all unique points?

I was able to hack this together in Theano using a complex vector, with X in the real and Y in the imaginary portion:

complex_points = point_x + point_y * 1j
unique_points, idxs, groups = T.extra_ops.Unique(True, True, False)(complex_points)

The TF equivalent I'm trying is:

complex_points = tf.complex(point_x, point_y)
unique_points, groups = tf.unique(complex_points)

TensorFlow fails with something like:

InvalidArgumentError: No OpKernel was registered to support Op 'Unique' with these attrs.
... # supported types include the float/int/string types, no complex types
[[Node: Unique_1 = Unique[T=DT_COMPLEX64, out_idx=DT_INT32](Complex_1)]]

Clearly, no one's implemented/registered a complex version of the "unique" op. Any idea how to accomplish this task?

Well, here's an even hackier solution: use a bit-level cast.

If your tensors are all of type tf.float32, you can use:

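# tf.pack was renamed to tf.stack (tf.pack is gone as of TensorFlow 1.0)
xy = tf.transpose(tf.stack([point_x, point_y]))  # shape [N, 2], float32
xy64 = tf.bitcast(xy, type=tf.float64)  # shape [N]: each (x, y) row viewed as one float64
unique64, idx = tf.unique(xy64)
unique_points = tf.bitcast(unique64, type=tf.float32)  # shape [M, 2], float32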

The principle behind this is to put the x and y coordinates side by side and let TensorFlow treat each (x, y) pair of 32-bit floats as a single 64-bit float, so that tf.unique works on the resulting 1-D tensor. Finally, bitcast the unique 64-bit floats back into pairs of genuine 32-bit floats, as desired.

Note: This method is really hacky. The reinterpreted 64-bit values may come out as NaN, infinity, or some other strange value, which could confuse the comparison, though the chance is really slim.
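The reinterpretation described above can be tried outside TensorFlow as well. The following is only a sketch using NumPy's view instead of tf.bitcast; the sample values are made up for illustration, and it is not meant as a replacement for the TensorFlow code.

```python
import numpy as np

# Hypothetical sample data: four points, with (2.0, 1.0) appearing twice.
point_x = np.array([1.0, 2.0, 3.0, 2.0], dtype=np.float32)
point_y = np.array([5.0, 1.0, 5.0, 1.0], dtype=np.float32)

# Shape (4, 2): one (x, y) row per point, contiguous in memory.
xy = np.stack([point_x, point_y], axis=1)

# Reinterpret each 8-byte row as one float64 (bits are reused, not converted).
xy64 = xy.view(np.float64).reshape(-1)

# Deduplicate the 1-D float64 array. The result is ordered by the
# reinterpreted float64 values, not by the original point order.
unique64 = np.unique(xy64)

# Reinterpret each float64 as two float32 values again: shape (M, 2).
unique_points = unique64.view(np.float32).reshape(-1, 2)
```

Because the same bits are viewed in both directions, the recovered rows are exactly the original (x, y) pairs with the duplicate removed.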

Another possible workaround, if your data type is integer, is to pack two integers into one, like what a compiler does when it converts 2-D indices into 1-D ones. Say, if x = [1, 2, 3, 2] and y = [0, 1, 0, 1], you can compress x and y into one tensor as x*10+y (10 is just a large enough multiplier; any value larger than max(y) should work, assuming y is non-negative), then find the unique values in this compressed tensor.
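To make the packing and the unpacking step concrete, here is a small worked sketch of the arithmetic with the sample values above. It uses plain Python lists rather than tensors, and the names base, packed and unique_points are only illustrative; the same elementwise multiply, add, floor-divide and modulo should carry over to integer tensors, with the packed tensor passed to tf.unique.

```python
# Sample values from the text above.
x = [1, 2, 3, 2]
y = [0, 1, 0, 1]

# The multiplier must exceed max(y); this assumes every y is non-negative.
base = 10
assert base > max(y) and min(y) >= 0

# Pack each (x, y) pair into a single integer.
packed = [xi * base + yi for xi, yi in zip(x, y)]  # [10, 21, 30, 21]

# Deduplicate the packed values.
unique_packed = sorted(set(packed))  # [10, 21, 30]

# Unpack: quotient is x, remainder is y.
unique_points = [divmod(p, base) for p in unique_packed]  # [(1, 0), (2, 1), (3, 0)]
```

With a fixed-width integer type, x*base+y also has to stay within the range of that type.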

Lastly, if you don't have any reason to do this inside TensorFlow, it might be better to do it outside, say, in NumPy. You can evaluate the tensors, remove the duplicate values in NumPy, then use the resulting NumPy arrays to generate new tensors and feed them to the rest of your network.
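A minimal sketch of the NumPy route, assuming the tensors have already been evaluated into arrays. The names point_x_val and point_y_val are hypothetical stand-ins for those evaluated values, and np.unique with axis=0 is one way to deduplicate rows, not necessarily the only one.

```python
import numpy as np

# Hypothetical stand-ins for the evaluated point_x and point_y tensors.
point_x_val = np.array([1.0, 2.0, 3.0, 2.0], dtype=np.float32)
point_y_val = np.array([0.0, 1.0, 0.0, 1.0], dtype=np.float32)

# Shape (N, 2): one (x, y) row per point.
points = np.stack([point_x_val, point_y_val], axis=1)

# axis=0 treats each row as one item, so whole (x, y) pairs are compared.
# groups[i] is the index in unique_points of the i-th original point.
unique_points, groups = np.unique(points, axis=0, return_inverse=True)

# Flatten, in case the installed NumPy version returns a non-1-D inverse.
groups = groups.reshape(-1)
```

The rows of unique_points come back sorted rather than in first-seen order, and unique_points[groups] reproduces the original points.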


 