Some questions about Theano scan

I am a little confused about Theano's scan mechanism. Here is a simple code snippet that calculates A^k:

import numpy
import theano
import theano.tensor as T 

def recurrence(pre_output, a):
    print("test")
    return pre_output*a

x = T.ivector('x')

o, updates = theano.scan(
    fn = recurrence,
    outputs_info = T.ones_like(x),
    non_sequences = x,
    n_steps = 5
)

fun = theano.function([x], o)
print(fun([1,2,3]))

I put print("test") in the scan function. Since n_steps is 5, the recurrence function should be called 5 times, so my first thought was that the "test" string would be printed 5 times.

But, as shown in the following output, the "test" string is printed only once.

test
[[  1   2   3]
 [  1   4   9]
 [  1   8  27]
 [  1  16  81]
 [  1  32 243]]

So I am a little confused: since the recurrence function is called multiple (n_steps) times, why is the "test" string printed only once?

Any help would be much appreciated.

Thanks

I won't go too deep.

Theano works by building a graph of the calculation procedure, and then doing the math according to that graph in a sophisticated way that we don't have to care about. That's why Theano can calculate gradients: it has modelled the calculation procedure beforehand.

Here is the point: in theano.scan(fn=xxx,), by giving scan your fn, you tell Theano how to build the graph, not what to do on each pass through the loop.

Theano will build the graph according to your fn, but it will calculate it in its own way instead of using your code.

So here is the conclusion: your code is used only once, when scan builds the graph, and is then discarded. That explains why your "test" appears only once.
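To make the "build once, run many times" idea concrete, here is a toy sketch in plain Python. It is not how Theano is implemented; the Sym class and toy_scan function are made up for illustration and only mimic the two phases: the step function is called once on symbolic placeholders to record a graph, and the recorded graph is then evaluated n_steps times without calling the function again.

```python
# Toy model of "trace once, run many times".
# Sym and toy_scan are hypothetical helpers, NOT part of Theano.

class Sym:
    """A symbolic node: a named placeholder or a product of two nodes."""

    def __init__(self, name=None, left=None, right=None):
        self.name, self.left, self.right = name, left, right

    def __mul__(self, other):
        # Record the multiplication instead of performing it.
        return Sym(left=self, right=other)

    def evaluate(self, env):
        # Placeholders look up their value; products evaluate both sides.
        if self.name is not None:
            return env[self.name]
        return self.left.evaluate(env) * self.right.evaluate(env)


calls = []  # one entry per real invocation of the Python step function


def recurrence(pre_output, a):
    calls.append("test")  # stands in for print("test")
    return pre_output * a


def toy_scan(fn, initial, non_sequence, n_steps):
    # Build phase: fn runs exactly once, on symbolic placeholders.
    step_graph = fn(Sym("prev"), Sym("a"))
    # Run phase: only the recorded graph is evaluated; fn is never called again.
    outputs, prev = [], initial
    for _ in range(n_steps):
        prev = step_graph.evaluate({"prev": prev, "a": non_sequence})
        outputs.append(prev)
    return outputs


result = toy_scan(recurrence, initial=1, non_sequence=3, n_steps=5)
print(len(calls))  # 1
print(result)      # [3, 9, 27, 81, 243]
```

The step function runs a single time even though five results are produced, which is the same pattern as the single "test" line in the question's output.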

I hope this helps.

Here is a demo I created to explain this; if you like, you can try reading it. I come from China and hope my bad English does not make you uneasy.

https://gist.github.com/NickQianFeng/9b91f2ecaa4f7e5ddb89d1b50cac1576

