Consider a (dense) layer of 10 units. Now I would like to add another (single) unit to this layer. But how do I make sure that the previous 10 weights are not trained, and only the new one gets trained?
You can pass different lists of variables to different optimizers: one that trains the first 10 units, and another that trains only the new unit. Combined with a tf.cond(), you can have two branches in your graph for when you want to use only 10 or 11 neurons. You can also play with tf.stop_gradient(). It really depends on your use case.
Without more information it is difficult to answer, but something along the lines of:
import numpy as np
import tensorflow as tf

choose_branch = tf.placeholder(tf.bool)

# original layer: 10 hidden units
w_old = tf.Variable(np.random.normal(size=(1, 10)).astype("float32"))
b = tf.Variable(np.zeros((1,)).astype("float32"))

x = tf.placeholder(tf.float32, shape=[None, 1])
target = tf.placeholder(tf.float32, shape=[None, 1])

hidden_old = tf.nn.relu(tf.matmul(x, w_old) + b)
w_proj_old = tf.Variable(np.random.normal(size=(10, 1)).astype("float32"))
y_old = tf.matmul(hidden_old, w_proj_old)
cost_old = tf.reduce_mean(tf.square(y_old - target))

# the extra (11th) unit and its output projection
w_plus = tf.Variable(np.random.normal(size=(1, 1)).astype("float32"))
w_proj_plus = tf.Variable(np.random.normal(size=(1, 1)).astype("float32"))

w_new = tf.concat([w_old, w_plus], axis=1)          # (1, 11)
w_proj_new = tf.concat([w_proj_old, w_proj_plus], axis=0)  # (11, 1)

hidden_new = tf.nn.relu(tf.matmul(x, w_new) + b)
y_new = tf.matmul(hidden_new, w_proj_new)
cost_new = tf.reduce_mean(tf.square(y_new - target))

opt_old = tf.train.GradientDescentOptimizer(0.001)
opt_new = tf.train.GradientDescentOptimizer(0.0001)

# each optimizer only updates the variables in its var_list
train_step_old = opt_old.minimize(cost_old, var_list=[w_old, b, w_proj_old])
train_step_new = opt_new.minimize(cost_new, var_list=[w_plus, w_proj_plus])

y = tf.cond(choose_branch, lambda: y_old, lambda: y_new)
Careful, the code was not tested.
To do this kind of advanced operation I find it easier to switch to the lower-level API of TensorFlow.
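The same idea — concatenate the frozen weights with the new ones and backpropagate only into the new ones — can be illustrated without TensorFlow at all. A minimal NumPy sketch (toy data, a linear layer instead of the ReLU one above, and a made-up learning rate; all sizes are the hypothetical ones from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: 1 input feature, scalar target
x = rng.normal(size=(32, 1)).astype("float32")
target = rng.normal(size=(32, 1)).astype("float32")

# 10 frozen units plus 1 trainable new unit
w_old = rng.normal(size=(1, 10)).astype("float32")       # frozen
w_plus = rng.normal(size=(1, 1)).astype("float32")       # trainable
w_proj_old = rng.normal(size=(10, 1)).astype("float32")  # frozen
w_proj_plus = rng.normal(size=(1, 1)).astype("float32")  # trainable

def forward():
    hidden = x @ np.concatenate([w_old, w_plus], axis=1)            # (32, 11)
    return hidden, hidden @ np.concatenate([w_proj_old, w_proj_plus], axis=0)

def loss():
    _, y = forward()
    return float(np.mean((y - target) ** 2))

lr = 0.01
w_old_before = w_old.copy()
loss_start = loss()

for _ in range(100):
    hidden, y = forward()
    err = y - target
    # gradients of the mean-squared error, but only for the new variables
    grad_proj_plus = hidden[:, 10:].T @ (2 * err) / len(x)
    grad_plus = x.T @ (2 * err @ w_proj_plus.T) / len(x)
    w_proj_plus -= lr * grad_proj_plus
    w_plus -= lr * grad_plus

loss_end = loss()
assert np.array_equal(w_old, w_old_before)  # frozen weights untouched
assert loss_end < loss_start                # only the new unit learned
```

The key point is the same as var_list above: the update step simply never touches the frozen arrays, so their values cannot drift.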
A fully connected layer is usually defined as a matrix multiplication. For example, suppose your previous layer has 128 features, and you want to implement a fully connected layer with 256 features. You could write
batch_size = 128
num_in = 128
num_out = 256
h = tf.zeros((batch_size, num_in)) # actually, the output of the previous layer
# create the weights for the fully connected layer
w = tf.get_variable('w', shape=(num_in, num_out))
# compute the output
y = tf.matmul(h, w)
# -- feel free to add biases and activation
Now let's suppose you have trained w and want to add some extra neurons to this layer. You could create an extra variable holding the extra weights, and concatenate it with the existing one.
num_out_extra = 10
# now w is set to trainable=False, we don't want its values to change
w = tf.get_variable('w', shape=(num_in, num_out), trainable=False)
# our new weights
w_extra = tf.get_variable('w_extra', shape=(num_in, num_out_extra))
w_total = tf.concat([w, w_extra], axis=-1)
y = tf.matmul(h, w_total)
# now y has 266 features
You will need to initialize all the weights one way or the other, of course — for example, restore w from a checkpoint and run tf.variables_initializer([w_extra]) for the new part only.
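As a quick sanity check of the shapes involved, here is a framework-free NumPy sketch mirroring the snippet above (same hypothetical sizes; zeros stand in for the real values):

```python
import numpy as np

batch_size, num_in, num_out, num_out_extra = 128, 128, 256, 10

h = np.zeros((batch_size, num_in), dtype="float32")  # output of previous layer
w = np.zeros((num_in, num_out), dtype="float32")             # frozen weights
w_extra = np.zeros((num_in, num_out_extra), dtype="float32") # new, trainable
w_total = np.concatenate([w, w_extra], axis=-1)

y = h @ w_total
assert w_total.shape == (num_in, num_out + num_out_extra)  # (128, 266)
assert y.shape == (batch_size, num_out + num_out_extra)    # (128, 266)
```

Because the concatenation happens along the last axis, the old 256 output features keep their positions; the 10 new features are simply appended after them.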