While working with keras and tensorflow, I found the following lines of code confusing.
w_init = tf.random_normal_initializer()
self.w = tf.Variable(
    initial_value=w_init(shape=(input_dim, units), dtype='float32'),
    trainable=True)
Also, I have seen something like:
Dense(64, activation='relu')(x)
Therefore, if Dense(...) creates the object for me, how can I follow it with (x)?
Likewise for w_init above. How can I write something like this:
tf.Variable(initial_value=tf.random_normal_initializer()(shape=(input_dim, units), dtype='float32'), trainable=True)
Does Python allow "ClassName()" followed by "()" when creating an object such as a layer?
While reading about closures in Python, I found that a function can return another function. Is this what really happens in Keras?
Any help is much appreciated!!
These are two totally different ways to define models.
Keras works with the concept of layers. Each line defines a full layer of your network. What you are referring to specifically is Keras' functional API. The idea is to combine layers like this:
inp = Input(shape=(28, 28, 1))
x = Conv2D(32, (6, 6), strides=(1, 1), activation='relu')(inp)  # 32 filters, 6x6 kernel
# ... etc ...
x = Flatten()(x)
x = Dense(10, activation='softmax')(x)
model = Model(inputs=[inp], outputs=[x])
This way you've created a full CNN in just a few lines. Note that you never had to manually input the shape of the weight vectors or the operations that are performed. These are inferred automatically by keras.
Now, this just needs to be compiled through model.compile(...) and then you can train it through model.fit(...).
On the other hand, TensorFlow is a bit more low-level. This means that you have to define the variables and operations by hand. So in order to write the convolution and fully-connected layers you'd have to do the following:
# Input placeholders
x = tf.placeholder(tf.float32, shape=(None, 28, 28, 1))
y = tf.placeholder(tf.float32, shape=(None, 10))
# Convolution layer
W1 = tf.Variable(tf.truncated_normal([6, 6, 1, 32], stddev=0.1))
b1 = tf.Variable(tf.constant(0.1, tf.float32, [32]))
z1 = tf.nn.conv2d(x, W1, strides=[1, 1, 1, 1], padding='SAME') + b1
c1 = tf.nn.relu(z1)
# ... etc ...
# Flatten
flat = tf.reshape(p2, [-1, ...]) # need to calculate the ... by ourselves
# Dense
W3 = tf.Variable(tf.truncated_normal([..., 10], stddev=0.1)) # same size as before
b3 = tf.Variable(tf.constant(0.1, tf.float32, [10]))
fc1 = tf.nn.relu(tf.matmul(flat, W3) + b3)
Two things to note here. There is no explicit definition of a model here, and this has to be trained through a tf.Session with a feed_dict feeding the data to the placeholders. If you're interested, you'll find several guides online.
TensorFlow has a much friendlier and easier way to define and train models through eager execution, which will be the default in TF 2.0! So the code you posted is in a sense the old way of doing things in TensorFlow. It's worth taking a look at TF 2.0, which actually recommends doing things the Keras way!
Edit (after comment by OP):
No, a layer is not a closure. A Keras layer is a class that implements a __call__ method, which makes its instances callable. That __call__ method is a wrapper around the call method that users typically write.
You can take a look at the implementation here
Basically how this works is:
class MyClass:
    def __init__(self, param):
        self.p = param

    def call(self, x):
        print(x)
If you try to write c = MyClass(1)(3), you'll get a TypeError saying that the 'MyClass' object is not callable. But if you write it like this:
class MyClass:
    def __init__(self, param):
        self.p = param

    def __call__(self, x):
        print(x)
It works now. Essentially keras does it like this:
class MyClass:
    def __init__(self, param):
        self.p = param

    def call(self, x):
        print(x)

    def __call__(self, x):
        # Delegate to the user-written call() and pass its result back
        return self.call(x)
So when you write your own layer, you only implement your own call method; the __call__ method that wraps it is inherited from Keras' base Layer class.
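To make the inheritance part concrete, here is a minimal pure-Python sketch of the pattern. It is not the real Keras code: BaseLayer, Scale and the build bookkeeping are hypothetical names made up for illustration, and the actual Layer class does much more inside __call__.

```python
class BaseLayer:
    """Hypothetical stand-in for a framework base class (NOT the real Keras Layer)."""

    def __init__(self):
        self.built = False

    def build(self, x):
        # Subclasses may create state here; runs once, before the first call().
        pass

    def call(self, x):
        raise NotImplementedError("subclasses implement call()")

    def __call__(self, x):
        # Bookkeeping lives here, then the work is delegated to call().
        if not self.built:
            self.build(x)
            self.built = True
        return self.call(x)


class Scale(BaseLayer):
    """Hypothetical user-written layer: only __init__ and call are defined."""

    def __init__(self, factor):
        super().__init__()
        self.factor = factor

    def call(self, x):
        return [self.factor * v for v in x]


layer = Scale(2)             # step 1: construct the object
doubled = layer([1, 2, 3])   # step 2: call it, which runs BaseLayer.__call__
same = Scale(2)([1, 2, 3])   # both steps in a single expression
```

Scale never defines __call__ itself, yet Scale(2)([1, 2, 3]) works because the method is inherited. That is the same shape as Dense(64, activation='relu')(x): the first pair of parentheses constructs the layer, the second pair calls it.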
Just from the syntax, I would say that Dense() returns a function (or more accurately a callable). Similarly, w_init is a callable as well.
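As a rough illustration of the difference between "a function" and "a callable", the sketch below puts a closure next to a callable object. Both support the f(...)(...) syntax, which is why they are easy to confuse. make_adder and Adder are hypothetical names used only for this example.

```python
def make_adder(n):
    # Closure: the inner function remembers n from the enclosing scope.
    def add(x):
        return x + n
    return add


class Adder:
    # Callable object: the state lives on the instance,
    # and __call__ is what makes the instance callable.
    def __init__(self, n):
        self.n = n

    def __call__(self, x):
        return x + self.n


closure_result = make_adder(2)(3)   # function returning a function
object_result = Adder(2)(3)         # class instance with __call__

adder = Adder(2)
is_callable = callable(adder)       # True for any object defining __call__
stored_state = adder.n              # the instance keeps inspectable attributes
```

The call syntax is identical in both cases, but the object version keeps its state as attributes that can be inspected and changed later, which is the approach a layer or an initializer object takes.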