
Writing this exotic NN architecture with keras, tensorflow and python

I'm trying to get Keras to train a multiclass classification model that can be expressed as a network like this:

[Network diagram] The only set of trainable parameters are the coefficients a_pk; all the rest is given. The functions f_i are combinations of usual mathematical functions (for example, an exponential such as the f given in the edit at the end). Sigma stands for summing the previous terms, and softmax is the usual function. The (x_1, x_2, ..., x_n) are elements of the train or test set, and the x_pk are a specific subset of the original data, already selected.

The model in more depth:

Specifically, given an input (x_1, x_2, ..., x_n) in the train or test set, the network evaluates

$$ z_p = \sum_{k=1}^{n} a_{pk}\, f_k(x_k,\ x_{pk}), \qquad p = 1, \dots, m $$

where the f_k are given mathematical functions, the x_pk are rows of a particular subset of the original data, and the coefficients a_pk are the parameters I want to train. As I'm using Keras, I expect it to add a bias term to each row.

After the above evaluation, I will apply a softmax layer (each of the m values z_p above is a number that will be an input to the softmax function).
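For reference, "the usual function" here is the standard softmax over the m row outputs z_1, ..., z_m:

$$ \operatorname{softmax}(z)_p = \frac{e^{z_p}}{\sum_{j=1}^{m} e^{z_j}}, \qquad p = 1, \dots, m $$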

At the end I want to compile the model and run model.fit as usual.

The problem is that I couldn't translate the expression into Keras syntax.

My attempt:

Following the network sketch above, I first tried to consider each expression of the form z_p = sum_k a_pk*f(x_k, x_pk) as a Lambda layer in a Sequential model, but the best I could get to work was a combination of a Dense layer with linear activation (which would play the role of a row's parameters a_pk) followed by a Lambda layer outputting a vector (a1*f(x1,y1), ..., an*f(xn,yn)) without the required summation, as follows:

from keras.models import Sequential
from keras.layers import Dense, Lambda
import tensorflow as tf

model = Sequential()
#single row considered:
model.add(Lambda(lambda x:  f_fixedRow(x), input_shape=(nFeatures,))) 
#parameters set after lambda layer to get (a1*f(x1,y1),...,an*f(xn,yn)) and not (f(a1*x1,y1),...,f(an*xn,yn))
model.add(Dense(nFeatures, activation='linear')) 

#missing summation: sum(x)
#missing evaluation of f in all other rows

model.add(Dense(classes,activation='softmax',trainable=False)) #should get all rows
model.compile(optimizer='sgd',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

Also, I had to define the function in the Lambda layer call with the x_pk argument already fixed (because the Lambda function can take only the input layer as its variable):

def f_fixedRow(x):
   #picking a particular row (as a vector) to evaluate f in (f works element-wise)
   y=tf.constant(value=x[0,:],dtype=tf.float32) 
   return f(x,y)

I managed to write the f function with TensorFlow (working element-wise along a row), although this is a possible source of problems in my code (and the above workaround seems unnatural).

I also thought that if I could properly write the element-wise sum of the vector in the aforementioned attempt, I could repeat the same procedure in a parallelized manner with the Keras functional API, and then insert the output of each parallel model into a softmax function, as I need.

Another approach I considered was to train the parameters while keeping their natural matrix structure seen in the network description, maybe by writing a matrix Lambda layer, but I could not find anything related to this idea.

Anyway, I'm not sure what a good way to work with this model within Keras is; maybe I'm missing an important point because of the non-standard way the parameters are written, or from lack of experience with TensorFlow. Any suggestions are welcome.

For this answer, it's important that f be a tensor function that operates elementwise (no iterating). This is reasonably easy to have; just check the Keras backend functions.

Assumptions:

  • The x_pk set is constant, otherwise this solution must be reviewed.
  • The function f is elementwise (if not, please show f for better code; an illustrative example follows this list).
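To illustrate what "elementwise" means here, a hypothetical f_example (not the asker's actual function) that would qualify, since every operation acts entry by entry on tensors of matching shape:

import keras.backend as K

def f_example(x, y):
    #every operation below acts entry by entry, so this qualifies as elementwise
    return K.exp(-K.square(x - y))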

Your model will need x_pk as a tensor input, and you should do that in a functional API model.

import keras.backend as K
from keras.layers import Input, Lambda, Activation
from keras.models import Model

#x_pk data
x_pk_numpy = select_X_pk_samples(x_train)
x_pk_tensor = K.variable(x_pk_numpy)

#number of rows in x_pk
m = len(x_pk_numpy)

#I suggest a fixed batch size for simplicity
batch = some_batch_size

First let's work on the function that will take x and x_pk and call f.

def calculate_f(inputs): #inputs will be a list with x and x_pk
    x, x_pk = inputs

    #since f will work elementwise, let's replicate x and x_pk so they have equal shapes 
    #please explain f for better optimization

    # x from (batch, n) to (batch, m, n)
    x = K.stack([x]*m, axis=1)

    # x_pk from (m, n) to (batch, m, n)
    x_pk = K.stack([x_pk]*batch, axis=0)
        #a batch size of 1 could make this even simpler    
        #a variable batch size would make this more complicated
        #certain f functions could make this process unnecessary    

    return f(x, x_pk)
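A quick way to sanity-check the shapes produced above (a sketch; it assumes f is already defined and reuses x_pk_numpy, x_pk_tensor, batch and m from above):

import numpy as np

#run a random batch through calculate_f and inspect the static shape
n = x_pk_numpy.shape[1]  #number of features per row
x_dummy = K.constant(np.random.rand(batch, n))
print(K.int_shape(calculate_f([x_dummy, x_pk_tensor])))  #expected: (batch, m, n)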

Now, different from a Dense layer, this formula uses the a_pk weights multiplied elementwise, so we need a custom layer:

from keras.layers import Layer

class ElementwiseWeights(Layer):
    def __init__(self, **kwargs):
        super(ElementwiseWeights, self).__init__(**kwargs)

    def build(self, input_shape):
        weight_shape = (1,) + input_shape[1:] #shape (1, m, n)

        self.kernel = self.add_weight(name='kernel', 
                                  shape=weight_shape,
                                  initializer='uniform',
                                  trainable=True)

        super(ElementwiseWeights, self).build(input_shape)  

    def compute_output_shape(self,input_shape):
        return input_shape

    def call(self, inputs):
        return self.kernel * inputs
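The leading 1 in the kernel shape is what lets the same trainable (m, n) matrix of a_pk coefficients broadcast across every sample in the batch. A minimal isolated check of the layer (a sketch, reusing batch, m and n from above):

test_in = Input(batch_shape=(batch, m, n))  #same shape the layer receives in the model
test_out = ElementwiseWeights()(test_in)
print(K.int_shape(test_out))  #expected: (batch, m, n), unchanged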

Now let's build our functional API model:

#x_pk model tensor input
x_pk = Input(tensor=x_pk_tensor) #shape (m, n)

#x usual input with fixed batch size
x = Input(batch_shape=(batch,n))  #shape (batch, n)

#calculate F
out = Lambda(calculate_f)([x, x_pk]) #shape (batch, m, n)

#multiply a_pk
out = ElementwiseWeights()(out) #shape (batch, m, n)

#sum n elements, keep m rows:
out = Lambda(lambda x: K.sum(x, axis=-1))(out) #shape (batch, m)

#softmax
out = Activation('softmax')(out) #shape (batch,m)
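Reading the shape comments from top to bottom, this chain computes exactly the expression from the question: f is evaluated against every row of x_pk, multiplied elementwise by the trainable a_pk, summed over the n features to give one number per row, and the m resulting numbers go through softmax.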

Continue this model with whatever you want and finish it:

model = Model([x, x_pk], out)
model.compile(.....)
model.fit(x_train, y_train, ....) #perhaps you might need .fit([x_train], y_train, ...)
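Note that because x_pk enters the model through Input(tensor=x_pk_tensor), Keras should feed it from that tensor automatically, which is why fit is called with x_train only; if Keras complains about the number of inputs, the bracketed form in the comment above is the fallback.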

Edit for function f

You can have the proposed f like this:

#create the n coefficients:
coefficients = np.array([c0, c1, .... , cn])
coefficients = coefficients.reshape((1,1,n))

def f(x, x_pk):

    c = K.variable(coefficients) #shape (1, 1, n)
    out = (x - x_pk) / c
    return K.exp(out)
  • This f would accept x with shape (batch, 1, n), without the stack used in the calculate_f function.
  • Or it could accept x_pk with shape (1, m, n), allowing a variable batch size.

But I'm not sure it's possible to have both of these shapes together. Testing this might be interesting.
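In fact, for an elementwise f like the proposed one, plain broadcasting should handle both shapes at once and make the K.stack replication in calculate_f unnecessary. A sketch of that variant (hypothetical, untested, same assumptions as above):

def calculate_f_broadcast(inputs):
    x, x_pk = inputs
    x = K.expand_dims(x, axis=1)        # (batch, n) -> (batch, 1, n)
    x_pk = K.expand_dims(x_pk, axis=0)  # (m, n) -> (1, m, n)
    return f(x, x_pk)                   # elementwise ops broadcast to (batch, m, n)

This would also lift the fixed-batch-size restriction, since nothing is replicated by batch anymore.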
