

Julia Neural Network code same speed as PyPy

So I have some neural network code in Python which I rewrote in Julia. The straight Python code runs in about 7 seconds, while both the Julia and PyPy code run in about 0.75 seconds.

sigmoid(z::Float64) = 1/(1 + exp(-z))
sigmoidPrime(z::Float64) = sigmoid(z) * (1 - sigmoid(z))

### Types ###

abstract AbstractNode

type Edge
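    # weight is the trainable parameter; derivative accumulates the gradient during
    # backpropagation, and augmented flags edges already visited in a traversal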
    source::AbstractNode
    target::AbstractNode
    weight::Float64
    derivative::Float64
    augmented::Bool

    Edge(source::AbstractNode, target::AbstractNode) = new(source, target, randn(1,1)[1], 0.0, false)
end

type Node <: AbstractNode
    incomingEdges::Vector{Edge}
    outgoingEdges::Vector{Edge}
    activation::Float64
    activationPrime::Float64

    Node() = new([], [], -1.0, -1.0)
end

type InputNode <: AbstractNode
    index::Int
    incomingEdges::Vector{Edge}
    outgoingEdges::Vector{Edge}
    activation::Float64

    InputNode(index::Int) = new(index, [], [], -1.0)
end

type BiasNode <: AbstractNode
    incomingEdges::Vector{Edge}
    outgoingEdges::Vector{Edge}
    activation::Float64

    BiasNode() = new([], [], 1.0)
end

type Network
    inputNodes::Vector{InputNode}
    hiddenNodes::Vector{Node}
    outputNodes::Vector{Node}

    function Network(sizes::Array, bias::Bool=true)
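        # sizes = [#inputs, #hidden, #outputs]; fully connect input -> hidden and hidden -> output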
        inputNodes = [InputNode(i) for i in 1:sizes[1]];
        hiddenNodes = [Node() for _ in 1:sizes[2]];
        outputNodes = [Node() for _ in 1:sizes[3]];

        for inputNode in inputNodes
            for node in hiddenNodes
                edge = Edge(inputNode, node);
                push!(inputNode.outgoingEdges, edge)
                push!(node.incomingEdges, edge)
            end
        end

        for node in hiddenNodes
            for outputNode in outputNodes
                edge = Edge(node, outputNode);
                push!(node.outgoingEdges, edge)
                push!(outputNode.incomingEdges, edge)
            end
        end

        if bias == true
            biasNode = BiasNode()
            for node in hiddenNodes
                edge = Edge(biasNode, node);
                push!(biasNode.outgoingEdges, edge)
                push!(node.incomingEdges, edge)
            end
        end

        new(inputNodes, hiddenNodes, outputNodes)
    end
end


### Methods ###

function evaluate(obj::Node, inputVector::Array)
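    # activation starts at -1.0, so anything above -0.5 means this node was already evaluated and can be reused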
    if obj.activation > -0.5
        return obj.activation
    else
        weightedSum = sum([d.weight * evaluate(d.source, inputVector) for d in obj.incomingEdges])
        obj.activation = sigmoid(weightedSum)
        obj.activationPrime = sigmoidPrime(weightedSum)

        return obj.activation
    end
end

function evaluate(obj::InputNode, inputVector::Array)
    obj.activation = inputVector[obj.index]
    return obj.activation
end

function evaluate(obj::BiasNode, inputVector::Array)
    obj.activation = 1.0
    return obj.activation
end

function updateWeights(obj::AbstractNode, learningRate::Float64)
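    # the augmented flag ensures each edge's weight is updated exactly once per recursive pass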
    for d in obj.incomingEdges
        if d.augmented == false
            d.augmented = true
            d.weight -= learningRate * d.derivative
            updateWeights(d.source, learningRate)
            d.derivative = 0.0
        end
    end
end

function compute(obj::Network, inputVector::Array)
    output = [evaluate(node, inputVector) for node in obj.outputNodes]
    for node in obj.outputNodes
        clear(node)
    end
    return output
end

function clear(obj::AbstractNode)
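    # walk backwards through the graph, resetting cached activations and the augmented flags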
    for d in obj.incomingEdges
        obj.activation = -1.0
        obj.activationPrime = -1.0
        d.augmented = false
        clear(d.source)
    end
end

function propagateDerivatives(obj::AbstractNode, error::Float64)
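    # accumulate this edge's gradient, then push the error (scaled by the weight and the
    # activation derivative) back to the source node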
    for d in obj.incomingEdges
        if d.augmented == false
            d.augmented = true
            d.derivative += error * obj.activationPrime * d.source.activation
            propagateDerivatives(d.source, error * d.weight * obj.activationPrime)
        end
    end
end

function backpropagation(obj::Network, example::Array)
    output = [evaluate(node, example[1]) for node in obj.outputNodes]
    error = output - example[2]
    for (node, err) in zip(obj.outputNodes, error)
        propagateDerivatives(node, err)
    end

    for node in obj.outputNodes
        clear(node)
    end
end

function train(obj::Network, labeledExamples::Array, learningRate::Float64=0.7, iterations::Int=10000)
    for _ in 1:iterations
        for ex in labeledExamples
            backpropagation(obj, ex)
        end

        for node in obj.outputNodes
            updateWeights(node, learningRate)
        end

        for node in obj.outputNodes
            clear(node)
        end
    end
end


labeledExamples = Array[Array[[0,0,0], [0]],
                        Array[[0,0,1], [1]],
                        Array[[0,1,0], [0]],
                        Array[[0,1,1], [1]],
                        Array[[1,0,0], [0]],
                        Array[[1,0,1], [1]],
                        Array[[1,1,0], [1]],
                        Array[[1,1,1], [0]]];

neuralnetwork = Network([3,4,1])
@time train(neuralnetwork, labeledExamples)

I haven't provided the Python code because I'm not sure it's necessary (though I will if you really want it). I'm certainly not expecting anyone to spend a lot of time making complete sense of this code; I'm basically just looking for glaring or systematic inefficiencies in the Julia implementation itself (as opposed to the algorithm).

My motivation for doing this is that designing a neural network this way feels much more natural than vectorizing the algorithm and using NumPy, but of course all that looping and jumping around class structures is slow in Python.

Thus this seemed like a natural candidate for porting to Julia to see whether I could get a major speedup. An order-of-magnitude speedup over straight Python is nice, but what I was really hoping for was an order-of-magnitude speedup over PyPy (some benchmarks I found online suggested that was a reasonable expectation).

Note: this code has to be run in Julia 0.3 to work.

This does seem like more of a code review than a question (there aren't any question marks), but I'll take a crack at it anyway. The only obvious potential performance issue is that you're allocating arrays via comprehensions in evaluate, compute, and backpropagation. The weighted-sum computation in evaluate would be much more efficient as a for loop. For the other two methods, you may want to use pre-allocated arrays instead of comprehensions. You can also use Julia's built-in profiler to see where your code is spending most of its time; that may reveal some non-obvious hot spots that you can optimize further.
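By way of illustration, here is a minimal sketch of what those two suggestions could look like, keeping the types above unchanged. The loop version of evaluate accumulates the weighted sum in a local variable instead of building a temporary array, and compute writes into a pre-allocated result vector (using Julia 0.3's Array(Float64, n) constructor). Treat this as a sketch rather than a tested drop-in replacement.

function evaluate(obj::Node, inputVector::Array)
    if obj.activation > -0.5
        return obj.activation
    else
        # accumulate the weighted sum in place; no temporary array is allocated
        weightedSum = 0.0
        for d in obj.incomingEdges
            weightedSum += d.weight * evaluate(d.source, inputVector)
        end
        obj.activation = sigmoid(weightedSum)
        obj.activationPrime = sigmoidPrime(weightedSum)
        return obj.activation
    end
end

function compute(obj::Network, inputVector::Array)
    # pre-allocate the output vector instead of using a comprehension
    output = Array(Float64, length(obj.outputNodes))
    for (i, node) in enumerate(obj.outputNodes)
        output[i] = evaluate(node, inputVector)
    end
    for node in obj.outputNodes
        clear(node)
    end
    return output
end

The same pre-allocation idea applies to the comprehension in backpropagation. For profiling, running the training call under @profile and then calling Profile.print() should point at any remaining hot spots.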

Regarding the comparison to PyPy, it's quite possible that both Julia and PyPy are doing very well with this code (at or near C performance), in which case you wouldn't expect Julia to be much faster than PyPy, since both are close to optimal. Comparing against a C implementation would be very informative, since it would show how much performance both Julia and PyPy are leaving on the table. Fortunately, this code seems like it would be pretty straightforward to port to C.
