Some questions about Theano scan

Question

I am a little confused about Theano's scan mechanism. Here is a simple code snippet that calculates A^k:

import numpy
import theano
import theano.tensor as T 

def recurrence(pre_output, a):
    print("test")
    return pre_output*a

x = T.ivector('x')

o, updates = theano.scan(
    fn = recurrence,
    outputs_info = T.ones_like(x),
    non_sequences = x,
    n_steps = 5
)

fun = theano.function([x], o)
print(fun([1,2,3]))

I put print("test") inside the scan function. Since n_steps is 5, the recurrence function should be called 5 times, so my first thought was that the string "test" would be printed 5 times.

But, as the following output shows, "test" is printed only once.

test
[[  1   2   3]
 [  1   4   9]
 [  1   8  27]
 [  1  16  81]
 [  1  32 243]]

So I am a little confused: if the recurrence function is called multiple (n_steps) times, why is "test" printed only once?

Any help would be much appreciated.

Thanks

Answer 1

I won't go too deep.

Theano works by building a graph of the calculation procedure and then doing the math according to that graph, in a sophisticated way that we don't have to care about. That is why Theano can calculate gradients: it has modeled the calculation procedure beforehand.
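To make the "graph first, math later" idea concrete, here is a minimal pure-Python sketch. It is only an analogy: the classes Expr, Const, Var, Add and Mul are hypothetical names invented for this illustration and are not Theano's API or its implementation. It shows that once an expression is stored as a graph, a second graph for the gradient can be derived from it.

```python
class Expr:
    """Base class for nodes of a tiny expression graph (illustrative only)."""
    def __add__(self, other):
        return Add(self, other)

    def __mul__(self, other):
        return Mul(self, other)


class Const(Expr):
    def __init__(self, value):
        self.value = value

    def eval(self, env):
        return self.value

    def grad(self, wrt):
        return Const(0.0)


class Var(Expr):
    def __init__(self, name):
        self.name = name

    def eval(self, env):
        return env[self.name]

    def grad(self, wrt):
        # Derivative of a variable is 1 with respect to itself, else 0.
        return Const(1.0 if self is wrt else 0.0)


class Add(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def eval(self, env):
        return self.left.eval(env) + self.right.eval(env)

    def grad(self, wrt):
        return self.left.grad(wrt) + self.right.grad(wrt)


class Mul(Expr):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def eval(self, env):
        return self.left.eval(env) * self.right.eval(env)

    def grad(self, wrt):
        # Product rule, expressed as a new graph.
        return self.left.grad(wrt) * self.right + self.left * self.right.grad(wrt)


x = Var("x")
y = x * x + x                    # builds a graph; nothing is computed yet
dy_dx = y.grad(x)                # a second graph, derived from the first
value = y.eval({"x": 3.0})       # the math happens only here
slope = dy_dx.eval({"x": 3.0})
print(value, slope)
```

The line `y = x * x + x` does no arithmetic at all; it only records structure, which is what makes the later `grad` call possible.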

Here is the point: in theano.scan(fn=xxx,), by giving scan your fn, you tell Theano how to build the graph, not what to do in each iteration of the loop.

Theano will build the graph according to your fn, but it will calculate it in its own way instead of running your code.

So here is the conclusion: your code is used only once, when scan builds the graph, and then it is discarded. That explains why your "test" appears only once.
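The same behaviour can be imitated in a few lines of plain Python. This is a rough sketch under stated assumptions, not how Theano is implemented: Sym, MulNode and toy_scan are hypothetical names made up for the illustration, and toy_scan folds graph building and execution into one call for brevity, whereas Theano builds the graph in theano.scan and runs it later through the compiled function. A list named calls stands in for print("test") so the number of calls can be checked.

```python
class Sym:
    """Symbolic placeholder: records a multiplication instead of doing it."""
    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        return MulNode(self, other)

    def run(self, env):
        return env[self.name]


class MulNode(Sym):
    """Graph node for left * right (hypothetical, for illustration)."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def run(self, env):
        return self.left.run(env) * self.right.run(env)


def toy_scan(fn, init, non_sequence, n_steps):
    # fn is called exactly once, with placeholders, to record the step graph.
    step_graph = fn(Sym("prev"), Sym("a"))
    outputs, prev = [], init
    for _ in range(n_steps):
        # Each iteration evaluates the recorded graph; fn is not called again.
        prev = step_graph.run({"prev": prev, "a": non_sequence})
        outputs.append(prev)
    return outputs


calls = []

def recurrence(pre_output, a):
    calls.append("test")   # stands in for print("test")
    return pre_output * a


result = toy_scan(recurrence, init=1, non_sequence=2, n_steps=5)
print(len(calls), result)
```

Here recurrence runs once even though five steps are computed, because the loop replays the recorded graph rather than the Python function.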

I hope this helps.

Here is a demo I created to explain this; you can read through it if you like. I come from China and hope my bad English does not bother you.

https://gist.github.com/NickQianFeng/9b91f2ecaa4f7e5ddb89d1b50cac1576

Solution 1
0 2016-11-04 13:30:18