TensorFlow: indexing through a multidimensional array
I have the matrix of probabilities below, and I'm trying to index it to pick one probability from each row so that I can take its log.
p_matrix =
[[0.5 0.5 ]
[0.45384845 0.5461515 ]
[0.45384845 0.5461515 ]
[0.45384845 0.5461515 ]
[0.48519668 0.51480335]
[0.48257706 0.517423 ]
[0.48257706 0.517423 ]
[0.48257706 0.517423 ]
[0.4807878 0.5192122 ]
[0.45384845 0.5461515 ]
[0.48257703 0.517423 ]]
The indices are stored in a placeholder: a = tf.placeholder(shape=None, dtype=tf.int32)
Normally I would simply do p_matrix[np.arange(a.shape[0], dtype=np.int32), a] to grab the corresponding results, but this gives me an error:
IndexError: arrays used as indices must be of integer (or boolean) type
Using a standard NumPy array in place of a gives me the desired result. I thought it might be something specific to using dtype=tf.int32, but I get the same result if I change the dtype of the placeholder to np.int32.
Also, when I check the type of a it returns <class 'numpy.ndarray'>, and for a[0] it returns <class 'numpy.int32'>.
Any ideas?
To summarize:
x = np.arange(a.shape[0])
y = np.array(list(a))
print(action_prob[x,y]) # This works.
print(action_prob[x,a]) # This does not work.
type(a) = <class 'numpy.ndarray'>
type(y) = <class 'numpy.ndarray'>
I can only assume it's because one of them is a tf.placeholder and, as a result, I can't specify this indexing during graph initialization?
EDIT:
Sample code:
import tensorflow as tf

class Model():
    def __init__(self, sess, s_size, game, lr=0.001):
        f_size = 12
        self.sess = sess  # stored so that learn() can call self.sess.run
        self.input = tf.placeholder(shape=[None, f_size], dtype=tf.float32)
        self.action = tf.placeholder(shape=None, dtype=tf.int32)
        self.p_matrix = tf.contrib.layers.fully_connected(self.input,
            20, activation_fn=tf.nn.softmax, biases_initializer=None)
        # Here I need to select the correct p_values
        # (p_selected stands for the indexing step I don't know how to write)
        self.log_prob = tf.log(self.action_prob[p_selected])
        self.train = tf.train.AdamOptimizer(lr).minimize(loss=-self.log_prob)

    def learn(self, s, a, td):
        # a = a.reshape(a.shape[0], 1) # necessary for the episodes
        feed_dict = {self.input: s, self.action: a}
        p_matrix = self.sess.run(self.p_matrix, feed_dict)
        log_prob, p_matrix = self.sess.run([self.log_prob, self.p_matrix], feed_dict)
        _ = self.sess.run(self.train, feed_dict)
You can do that with tf.gather_nd:
idx = tf.stack([tf.range(tf.shape(a)[0], dtype=a.dtype), a], axis=1)
p_selected = tf.gather_nd(p_matrix, idx)
Each row in idx contains the "coordinates" of one element to retrieve, like [[0, a[0]], [1, a[1]], ...].
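To see what those coordinates look like, here is a minimal NumPy sketch of the same lookup. It is not TensorFlow code: it only mimics what tf.stack and tf.gather_nd do for a 2-D input. The three-row p_matrix and the action indices in a are made-up illustration values, not data from the question's training run.

```python
import numpy as np

# Small stand-in for p_matrix (a few rows copied from the question).
p_matrix = np.array([
    [0.5, 0.5],
    [0.45384845, 0.5461515],
    [0.48519668, 0.51480335],
], dtype=np.float32)

# Hypothetical action indices, one per row (assumed for illustration).
a = np.array([0, 1, 1], dtype=np.int32)

# Build the (row, column) pairs, mirroring
# tf.stack([tf.range(tf.shape(a)[0], dtype=a.dtype), a], axis=1).
idx = np.stack([np.arange(a.shape[0], dtype=a.dtype), a], axis=1)

# Emulate tf.gather_nd on a 2-D array: look up each coordinate pair.
p_selected = p_matrix[idx[:, 0], idx[:, 1]]

# Same values as the NumPy fancy indexing used in the question.
assert np.array_equal(p_selected, p_matrix[np.arange(a.shape[0]), a])

# The log-probabilities the question was after.
log_prob = np.log(p_selected)

print(idx.tolist())
print(p_selected)
```

In the graph itself, p_selected from tf.gather_nd would take the place of the indexing expression inside tf.log, so the selection happens on tensors rather than on NumPy arrays.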