
Expand a 1-dim vector using the Taylor series of log(1+e^x) in Python

I need to non-linearly expand each pixel value of a 1-dim pixel vector using the Taylor series expansion of a specific non-linear function (e^x, log(x), or log(1+e^x)), but my current implementation does not look right to me, at least based on how Taylor series work. The basic intuition is to take the pixel array as the input neurons of a CNN model, where each pixel is non-linearly expanded with the Taylor series expansion of a non-linear function.

New update 1:

From my understanding, a Taylor series expresses a function F of a variable x in terms of the value of F and its derivatives at another value x0. In my problem, F is the non-linear transformation of the features (i.e., pixels), x is each pixel value, and x0 = 0, which gives the Maclaurin series.

New update 2:

If we use the Taylor series of log(1+e^x) with an approximation order of 2, each pixel value will yield two new pixels, obtained by taking the first and second expansion terms of the Taylor series.
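For concreteness, the Maclaurin expansion of log(1+e^x) can be written out as below. Reading "first and second expansion terms" as the x and x^2 terms (that is, skipping the constant log 2) is an assumption; under it, a pixel value x would map to the two values x/2 and x^2/8.

```latex
\log\left(1 + e^{x}\right) = \log 2 + \frac{x}{2} + \frac{x^{2}}{8} - \frac{x^{4}}{192} + O\left(x^{6}\right)
```

The x^3 and x^5 terms vanish because log(1+e^x) - x/2 is an even function of x.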

Graphic illustration

Here is the graphical illustration of the above formulation:

[image]

Where X is the pixel array, p is the approximation order of the Taylor series, and α is the Taylor expansion coefficient.

I want to non-linearly expand pixel vectors with the Taylor series expansion of a non-linear function, as the illustration above shows.

My current attempt

This is my current attempt, which does not work correctly for pixel arrays. I was thinking about how to make the same idea applicable to pixel arrays.

def taylor_func(x, approx_order=2):
    x_ = x[..., None] 
    x_ = tf.tile(x_, multiples=[1, 1, approx_order+ 1])  
    pows = tf.range(0, approx_order + 1, dtype=tf.float32) 
    x_p = tf.pow(x_, pows) 
    x_p_ = x_p[..., None]
    return x_p_

x = Input(shape=(4,4,3))
x_new = Lambda(lambda x: taylor_func(x, max_pow))(x)

My new updated attempt:

x_input= Input(shape=(32, 32,3))

def maclurin_exp(x, powers=2):
    out= 0
    for k in range(powers):
        out+= ((-1)**k) * (x ** (2*k)) / (math.factorial(2 * k))
    return res

x_input_new = Lambda(lambda x: maclurin_exp(x, max_pow))(x_input)

This attempt doesn't yield what the above mathematical formulation describes. I bet I missed something while doing the expansion. Can anyone point me to how to make this correct? Any better ideas?

Goal

I want to take a pixel vector and make it non-linearly distributed or expanded with the Taylor series expansion of a certain non-linear function. Is there any possible way to do this? Any thoughts? Thanks.

This is a really interesting question, but I can't say that I'm clear on it as of yet. So, while I have some thoughts, I might be missing the thrust of what you're looking to do.

It seems like you want to develop your own activation function instead of using something like ReLU or softmax. Certainly no harm there. And you gave three candidates: e^x, log(x), and log(1+e^x).

[image]

Notice that log(x) approaches negative infinity as x --> 0. So log(x) is right out. If that was intended as a check on the answers you get, or was something jotted down as you were falling asleep, no worries. But if it wasn't, you should spend some time making sure you understand the underpinnings of what you're doing, because the consequences can be quite high.

You indicated you were looking for a canonical answer, and you get a two for one here: both a canonical answer and highly performant code.

Consider that you're not likely to be able to write faster, more streamlined code than the folks behind SciPy, NumPy, or Pandas. Or PyPy. Or Cython, for that matter. Their stuff is the standard. So don't try to compete against them by writing your own, less performant (and possibly bugged) version, which you will then have to maintain as time passes. Instead, make the most of your development time and run time by using them.

Let's take a look at the implementation of e^x in SciPy and give you some code to work with. I know you don't need a graph for what you're doing at this stage, but they're pretty and can help you understand how the Taylor series (or Maclaurin series, which is the Taylor series about 0) behaves as the order of the approximation changes. It just so happens that SciPy has Taylor approximation built in.

import scipy
import numpy as np
import matplotlib.pyplot as plt

from scipy.interpolate import approximate_taylor_polynomial

x = np.linspace(-10.0, 10.0, num=100)

plt.plot(x, np.exp(x), label="e^x", color = 'black')

for degree in np.arange(1, 4, step=1):

    e_to_the_x_taylor = approximate_taylor_polynomial(np.exp, 0, degree, 1, order=degree + 2)

    plt.plot(x, e_to_the_x_taylor(x), label=f"degree={degree}")

plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.0, shadow=True)

plt.tight_layout()
plt.axis([-10, 10, -10, 10])
plt.show()

That produces this:

[image]

But let's say you're good with 'the maths', so to speak, and are willing to go with something slightly slower if it's more 'mathy', as in it handles symbolic notation well. For that, let me suggest SymPy.

With that in mind, here is a bit of SymPy code with a graph because, well, it looks good AND because we need to go back and hit another point again.

from sympy import series, Symbol, log, E
from sympy.functions import exp
from sympy.plotting import plot
import matplotlib.pyplot as plt
%matplotlib inline

plt.rcParams['figure.figsize'] = 13,10
plt.rcParams['lines.linewidth'] = 2

x = Symbol('x')

def taylor(function, x0, n):
    """ Defines Taylor approximation of a given function
    function -- is our function which we want to approximate
    x0 -- point where to approximate
    n -- order of approximation
    """    
    return function.series(x,x0,n).removeO()

# I get eyestrain; feel free to get rid of this
plt.rcParams['figure.figsize'] = 10, 8
plt.rcParams['lines.linewidth'] = 1

c = log(1 + pow(E, x))

plt = plot(c, taylor(c,0,1), taylor(c,0,2), taylor(c,0,3), taylor(c,0,4), (x,-5,5),legend=True, show=False)

plt[0].line_color = 'black'
plt[1].line_color = 'red'
plt[2].line_color = 'orange'
plt[3].line_color = 'green'
plt[4].line_color = 'blue'
plt.title = 'Taylor Series Expansion for log(1 +e^x)'
plt.show()

[image]

I think either option will get you where you need to go.
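To connect either approach back to the pixel vector in the question, here is a minimal NumPy sketch, not a definitive implementation. It hard-codes the Maclaurin coefficients of log(1+e^x) up to x^4 and returns one new value per series term for every pixel. The name `expand_pixels`, the choice to skip the constant term log(2), and the scaling of pixels to [0, 1] are all assumptions made for illustration.

```python
import math
import numpy as np

# Maclaurin coefficients of log(1 + e^x) for x^0 .. x^4:
# log(2) + x/2 + x^2/8 + 0*x^3 - x^4/192
SOFTPLUS_COEFFS = np.array([math.log(2.0), 1 / 2, 1 / 8, 0.0, -1 / 192])


def expand_pixels(pixels, order=2):
    """Hypothetical helper: expand each pixel into `order` series terms.

    pixels: array of any shape, assumed scaled to a small range such as [0, 1].
    Returns an array of shape pixels.shape + (order,), where the entry at
    index k-1 on the last axis holds alpha_k * x**k (constant term skipped).
    """
    if not 1 <= order <= len(SOFTPLUS_COEFFS) - 1:
        raise ValueError("order must be between 1 and 4 in this sketch")
    pixels = np.asarray(pixels, dtype=np.float64)
    powers = np.arange(1, order + 1)           # 1, 2, ..., order
    alphas = SOFTPLUS_COEFFS[1:order + 1]      # coefficients for those powers
    # Broadcasting adds a trailing axis with one slot per series term.
    return alphas * pixels[..., None] ** powers


pixels = np.array([0.0, 0.5, 1.0])
expanded = expand_pixels(pixels, order=2)
print(expanded.shape)
print(expanded)
```

Because the function only appends a trailing axis, the same call works on a `(32, 32, 3)` image and yields a `(32, 32, 3, order)` array.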

OK, now for the other point. You clearly stated, after a bit of revision, that log(1+e^x) was your first choice. But the others don't pass the sniff test. e^x vacillates wildly as the degree of the polynomial changes. Because of the opaqueness of algorithms and how few people can conceptually understand this stuff, data scientists can screw things up to a degree people can't even imagine. So make sure you're very solid on the theory for this.
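One piece of that theory is worth checking numerically: a truncated Maclaurin series is only trustworthy near 0. For log(1+e^x) the nearest singularity is at x = iπ, so the full series converges only for |x| < π, and raw pixel values in [0, 255] fall far outside that range. The sketch below compares the true function with its truncation through x^4; the helper names and the sample points are assumptions chosen for illustration.

```python
import numpy as np


def softplus(x):
    """Exact log(1 + e^x)."""
    return np.log1p(np.exp(x))


def softplus_maclaurin(x):
    """Maclaurin series of log(1 + e^x), truncated after the x^4 term."""
    return np.log(2.0) + x / 2 + x**2 / 8 - x**4 / 192


# Absolute truncation error at a few sample points.
errors = {}
for x in (0.5, 2.0, 5.0):
    errors[x] = abs(softplus(x) - softplus_maclaurin(x))
    print(f"x={x}: abs error = {errors[x]:.6f}")
```

The error is tiny at x = 0.5 and grows quickly as x moves away from 0, which is one reason to rescale pixels (for example to [0, 1]) before expanding them.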

One last thing: consider looking at the CDF of the Erlang distribution as an activation function (assuming I'm right and you're looking to roll your own activation function as an area of research). I don't think anyone has looked at that, but it strikes me as promising. I think you could break out each channel of the RGB as one of the two parameters, with the other being the physical coordinate.

You can use tf.tile and tf.math.pow to generate the elements of the series expansion. Then you can use tf.math.cumsum to compute the partial sums s_i. Finally, you can multiply by the weights w_i and compute the final sum.

Here is a code sample:

import math
import tensorflow as tf

x = tf.keras.Input(shape=(32, 32, 3))  # 3-channel RGB.

# The following is determined by your series expansion and its order.
# For example: log(1 + exp(x)), keeping the first 3 non-constant terms.
# https://www.wolframalpha.com/input/?i=taylor+series+log%281+%2B+e%5Ex%29
order = 3
alpha = tf.constant([1/2, 1/8, -1/192])  # Series coefficients.
power = tf.constant([1.0, 2.0, 4.0])
offset = math.log(2)

# These are the weights of the network; using a constant for simplicity here.
# The shape must coincide with the above order of series expansion.
w_i = tf.constant([1.0, 1.0, 1.0])

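# Series terms alpha_k * x**p_k. The constant offset is added once, after the
# cumulative sum, so every partial sum s_i contains log(2) exactly once.
elements = alpha * tf.math.pow(
    tf.tile(x[..., None], [1, 1, 1, 1, order]),
    power
)
s_i = offset + tf.math.cumsum(elements, axis=-1)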
y = tf.math.reduce_sum(w_i * s_i, axis=-1)
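To sanity-check the shapes and values without building a Keras graph, the same steps can be mirrored in NumPy. This is only a sketch: the tiny constant input batch is hypothetical, and adding the constant term log(2) once to each partial sum is an assumption about what s_i is meant to be.

```python
import math
import numpy as np

alpha = np.array([1 / 2, 1 / 8, -1 / 192])  # series coefficients
power = np.array([1.0, 2.0, 4.0])           # matching powers of x
offset = math.log(2)                        # constant term of the series
w_i = np.array([1.0, 1.0, 1.0])             # stand-in for learned weights

# Hypothetical batch: 1 image, 2x2 pixels, 3 channels, all values 0.5.
x = np.full((1, 2, 2, 3), 0.5)

# Broadcasting plays the role of tf.tile: one slot per series term.
elements = alpha * x[..., None] ** power       # shape (1, 2, 2, 3, 3)
s_i = offset + np.cumsum(elements, axis=-1)    # partial sums of the series
y = np.sum(w_i * s_i, axis=-1)                 # shape (1, 2, 2, 3)

print(elements.shape, s_i.shape, y.shape)
print(s_i[0, 0, 0])
```

The last entry of `s_i` along the final axis is the series truncated after the x^4 term, so it should sit close to log(1 + e^0.5) for this input.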
