Why is this simple Go program slower than its Node.js counterpart?

Question

I'm attempting to use Go to implement a binary tree with values on the leaves, i.e., equivalent to:

data Tree a 
  = Node {left: Tree, right: Tree} 
  | Leaf {value: a}

I had two problems: (1) I couldn't figure out a way to make a type with multiple constructors, so I had to fit all the data into one struct; (2) I couldn't make it polymorphic, so I had to use interface{} (which I guess is an "opt-out" of the type system?). This is the best I could come up with:

package main

import ("fmt")

type Tree struct {
  IsLeaf bool
  Left *Tree
  Value interface{}
  Right *Tree
}

func build(n int) *Tree {
  if (n == 0) {
    return &Tree{IsLeaf: true, Left: nil, Value: 1, Right: nil}
  } else {
    return &Tree{IsLeaf: false, Left: build(n - 1), Value: 0, Right: build(n - 1)}
  }
}

func sum(tree *Tree) int {
  if (tree.IsLeaf) {
    return tree.Value.(int)
  } else {
    return sum(tree.Left) + sum(tree.Right)
  }
}

func main() {
  fmt.Println(sum(build(23)))
}

It implements the type and tests it by summing over a huge generated tree. I then wrote an equivalent implementation in JavaScript (including the redundant data in the constructors, for fairness):

const build = n => {
  if (n === 0) {
    return {IsLeaf: true, Value: 1, Left: null, Right: null};
  } else {
    return {IsLeaf: false, Value: 0, Left: build(n - 1), Right: build(n - 1)};
  }
}

const sum = tree => {
  if (tree.IsLeaf) {
    return tree.Value;
  } else {
    return sum(tree.Left) + sum(tree.Right);
  }
}

console.log(sum(build(23)));

I compiled the Go code with go build test.go and ran it with time ./test. I ran the Node.js code with node test.js. Over several runs, the Go program took about 2.5 seconds on average, versus about 1.0 second for the Node.js one.

That makes Go 2.5x slower than Node.js for this simple program, which seems wrong, given that Go is a statically typed, compiled language with a mature compiler, whereas JavaScript is a dynamically typed, interpreted one.

Why is my Go program so slow? Am I missing some compiler flag, or is the code problematic?

Answer 1

Summary

This code is slower because of the type assertion and the redundant data.

Go doesn't encourage you to write type assertions in hot places:

tree.Value.(int)

Take out this type assertion (and accordingly change Value to an int type), and your code will perform about twice as fast (which should be around the speed of your node example).

Take out the redundant data as well, and your code will perform about three times as fast. See the playground example at the end of the post.

Details

I think this is a mistake of design, rather than implementation. Reading your question, I think there is some confusion about how Go's type system works.

Go's object model doesn't encourage you to do polymorphism using catch-all types (see the top half of this excellent answer for a discussion of Go's polymorphism).

In a JavaScript world, each object is a specific type. In Go, a struct can be treated as a specific interface type if it fulfils the interface's contract. Note that structs are not objects: what you called constructors are just struct initialisers.

It is possible to write Go code that operates on interface{} as a placeholder for all types, but the language doesn't really encourage you to write code this way (as you pointed out in your question, it was a challenge to write clean code in the way you would write it in JavaScript).

Because Go doesn't really have objects, trying to write code that feels very object-oriented in Go will be challenging (additionally, Go doesn't have standard inheritance or method overloading). For this reason, I don't think that your code is the kind of code that Go encourages the programmer to write. So, it's not a fair test.

Type assertion is slow. (I'm not familiar with the design of Go's internals, but this certainly suggests that the programmer is not expected to write a lot of type assertions.) Because of this, it's not surprising that your code does not perform well. I changed your code to:

type Tree struct {
  IsLeaf bool
  Left *Tree
  Value int
  Right *Tree
} 
 .....
func sum(tree *Tree) int {
  if (tree.IsLeaf) {
    return tree.Value
  } else {
    return sum(tree.Left) + sum(tree.Right)
  }
}

And achieved a 2x speed up on my machine.
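If you want to check the effect on your own machine, here is a rough timing sketch. The names AnyTree, IntTree, buildAny, buildInt, sumAny and sumInt are made up for this illustration, and the depth is smaller than the 23 used in the question so it finishes quickly. It is not a rigorous benchmark, and the numbers will vary between machines and Go versions.

```go
package main

import (
	"fmt"
	"time"
)

// AnyTree mirrors the question's struct: Value is boxed in an interface{}.
type AnyTree struct {
	IsLeaf      bool
	Left, Right *AnyTree
	Value       interface{}
}

// IntTree has the same shape but stores Value as a plain int.
type IntTree struct {
	IsLeaf      bool
	Left, Right *IntTree
	Value       int
}

func buildAny(n int) *AnyTree {
	if n == 0 {
		return &AnyTree{IsLeaf: true, Value: 1}
	}
	return &AnyTree{Left: buildAny(n - 1), Right: buildAny(n - 1), Value: 0}
}

func buildInt(n int) *IntTree {
	if n == 0 {
		return &IntTree{IsLeaf: true, Value: 1}
	}
	return &IntTree{Left: buildInt(n - 1), Right: buildInt(n - 1), Value: 0}
}

// sumAny needs a type assertion at every leaf.
func sumAny(t *AnyTree) int {
	if t.IsLeaf {
		return t.Value.(int)
	}
	return sumAny(t.Left) + sumAny(t.Right)
}

// sumInt reads the int field directly.
func sumInt(t *IntTree) int {
	if t.IsLeaf {
		return t.Value
	}
	return sumInt(t.Left) + sumInt(t.Right)
}

func main() {
	const depth = 18 // assumption: reduced from 23 to keep the run short

	start := time.Now()
	a := sumAny(buildAny(depth))
	anyTime := time.Since(start)

	start = time.Now()
	b := sumInt(buildInt(depth))
	intTime := time.Since(start)

	fmt.Println("sums:", a, b)
	fmt.Println("interface{}:", anyTime, "int:", intTime)
}
```

Both sums should be identical; only the timings differ.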

There are probably other optimisations: you might be able to remove IsLeaf, and you don't need to store values at non-leaf nodes (or alternatively, you could distribute values throughout the tree, so you never waste the Value field). I don't know whether JavaScript optimises out these unnecessary Values, but I don't believe Go does.

So, I think your code is using much more memory than it needs, which won't help performance either.
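As a sketch of the memory point, the following compares the per-node size of the question's struct with a slimmer one that drops IsLeaf and the interface{}. The type names Original and Slim are invented here, and the leaf test assumes every internal node has both children. The exact sizes depend on the platform, so I have not written them down.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Original mirrors the struct from the question.
type Original struct {
	IsLeaf bool
	Left   *Original
	Value  interface{}
	Right  *Original
}

// Slim drops IsLeaf and the interface{}: a node with no children is a leaf.
type Slim struct {
	Left, Right *Slim
	Value       int
}

func build(n int) *Slim {
	if n == 0 {
		return &Slim{Value: 1}
	}
	return &Slim{Left: build(n - 1), Right: build(n - 1)}
}

func sum(t *Slim) int {
	if t.Left == nil { // assumption: internal nodes always have both children
		return t.Value
	}
	return sum(t.Left) + sum(t.Right)
}

func main() {
	fmt.Println("bytes per node, original:", unsafe.Sizeof(Original{}))
	fmt.Println("bytes per node, slim:    ", unsafe.Sizeof(Slim{}))
	fmt.Println(sum(build(10)))
}
```

Internal nodes still carry an unused Value here; splitting nodes and leaves into separate types would remove that too.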

Does it matter?

I'm not personally convinced by "I wrote this program in X and Y, and found that Y was slower", especially as it's hard to compare fairly across frameworks. There are so many other sources of variance - programmer knowledge, machine load, spin-up time, etc.

To do a fair test you'd need to write code that's idiomatic in each language, but also use the same code. I don't think it's realistic to achieve both.

If this code is your specific scenario, and performance is the primary goal, then this test might be helpful. But, otherwise I don't think it's a very meaningful comparison.

At scale, I would expect other considerations to beat how fast you can create and traverse a tree. There are technical concerns like data throughput and performance under load, but also softer concerns like programmer time, and maintenance effort.

The academic exercise is interesting, though. And writing code like this is a good way to find the edges of a framework.

Edit: I tried making your code more Go-like, which has the added advantage of a 3x speedup over the original:

https://play.golang.org/p/mWaO3WR6pw

The tree is a bit heavy for the playground, but you can copy and paste the code to run locally.
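The playground code is not reproduced in this post, so the following is only a guess at what a more Go-like shape could look like, not necessarily what the link contains. It uses an interface with two implementations, Node and Leaf (names chosen for this sketch), which is the usual Go substitute for a type with multiple constructors.

```go
package main

import "fmt"

// Tree is anything that can report the sum of its leaves.
type Tree interface {
	Sum() int
}

// Node is an internal node; it stores no value of its own.
type Node struct {
	Left, Right Tree
}

// Leaf holds a single int value.
type Leaf struct {
	Value int
}

// Sum on a Node adds up both subtrees.
func (n *Node) Sum() int { return n.Left.Sum() + n.Right.Sum() }

// Sum on a Leaf is just its value.
func (l *Leaf) Sum() int { return l.Value }

// build creates a complete tree of the given depth with 1 at every leaf.
func build(n int) Tree {
	if n == 0 {
		return &Leaf{Value: 1}
	}
	return &Node{Left: build(n - 1), Right: build(n - 1)}
}

func main() {
	fmt.Println(build(10).Sum()) // 2^10 leaves, each worth 1: prints 1024
}
```

There is no IsLeaf flag and no type assertion: the dynamic dispatch on Sum() does the case analysis.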

There are more optimisations possible that I haven't tried, such as parallel construction of the tree.
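For what it's worth, parallel construction could be sketched roughly as below. I have not measured it, so whether it helps at all is an open question (allocation and garbage collection may dominate). The function buildParallel and its fork parameter are invented for this example: goroutines are spawned only for the top few levels, and the rest is built sequentially.

```go
package main

import (
	"fmt"
	"sync"
)

type Tree struct {
	Left, Right *Tree
	Value       int
}

// build is the plain sequential constructor.
func build(n int) *Tree {
	if n == 0 {
		return &Tree{Value: 1}
	}
	return &Tree{Left: build(n - 1), Right: build(n - 1)}
}

// buildParallel builds the two subtrees in separate goroutines for the top
// `fork` levels, then falls back to the sequential build below that.
func buildParallel(n, fork int) *Tree {
	if n == 0 || fork == 0 {
		return build(n)
	}
	t := &Tree{}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		t.Left = buildParallel(n-1, fork-1)
	}()
	go func() {
		defer wg.Done()
		t.Right = buildParallel(n-1, fork-1)
	}()
	wg.Wait() // both fields are fully written before t is returned
	return t
}

// sum treats a node without children as a leaf.
func sum(t *Tree) int {
	if t.Left == nil {
		return t.Value
	}
	return sum(t.Left) + sum(t.Right)
}

func main() {
	fmt.Println(sum(buildParallel(16, 3))) // 2^16 leaves: prints 65536
}
```

Limiting the fork depth keeps the number of goroutines small (at most 2 + 4 + 8 with fork = 3) instead of one per node.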

You may be able to extend this design to have the polymorphic behaviour that you want (by providing alternative Leaf implementations), but I'm not sure what Sum() means for non-number types. Not knowing how to define Sum() for such types is a good example of the kind of problem that makes it hard to decide to include polymorphism through generics.
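Note that Go 1.18 and later do have type parameters, which postdate this answer. Assuming such a version, one way to sidestep the "what does Sum() mean" problem is to restrict the value type to numbers. The constraint Number and the generic Node, Leaf and build below are illustrative names, not part of any library.

```go
package main

import "fmt"

// Number is a hypothetical constraint listing the value types Sum accepts.
type Number interface {
	~int | ~int64 | ~float64
}

// Tree is a tree whose leaves hold values of type T.
type Tree[T Number] interface {
	Sum() T
}

// Node is an internal node with two subtrees.
type Node[T Number] struct {
	Left, Right Tree[T]
}

// Leaf holds one value.
type Leaf[T Number] struct {
	Value T
}

func (n *Node[T]) Sum() T { return n.Left.Sum() + n.Right.Sum() }
func (l *Leaf[T]) Sum() T { return l.Value }

// build creates a complete tree of depth n with v at every leaf.
func build[T Number](n int, v T) Tree[T] {
	if n == 0 {
		return &Leaf[T]{Value: v}
	}
	return &Node[T]{Left: build(n-1, v), Right: build(n-1, v)}
}

func main() {
	fmt.Println(build(10, 1).Sum())  // T is inferred as int
	fmt.Println(build(3, 0.5).Sum()) // T is inferred as float64
}
```

Non-numeric leaf types are simply rejected at compile time, which answers the question by refusing to ask it.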

Answer 2

I thought this might be useful. This is my implementation of a binary search tree, which uses recursion, goroutines, and channels. It was meant to be used as a package, which is why I use exported and unexported functions. The exported functions are the ones you should use or modify. I wrote it a long time ago, and there are plenty of things that could have been written better. I added a Sum function just now for you, though. I added 23 nodes and got the sum in about a quarter of a second.

UPDATE: I've added a new function called GetTreeTotal(). If you look at the Tree struct, I now keep a Total field, and the Add() function updates that field as each node is added. Now the sum doesn't have to be calculated all at once; it's just part of the Tree's metadata, so in that sense it's super fast. Using similar logic, the number of nodes in the tree could be kept as metadata as well. Knowing that would speed up functions like TreeToArray(), because one could then define the size of the slice beforehand, which means fewer allocations.

UPDATE 2: This question got me curious, so I rewrote the code below and turned it into a package (https://github.com/marcsantiago/GoTree). Iterative inserts are almost 3x faster (benchmarks included), though you only really see the difference when the number of inserts is very high.
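To illustrate the two ideas from the updates (metadata kept on the tree, and iterative inserts), here is a small self-contained sketch. It is not the code from the linked package. The Count field and the ToSlice method are additions made up for this example, and ignoring duplicates is an assumption of the sketch.

```go
package main

import "fmt"

type node struct {
	Left, Right *node
	Data        int
}

// Tree keeps Total and Count up to date on every successful insert.
type Tree struct {
	Root  *node
	Total int
	Count int
}

// Add inserts data with a loop instead of recursion. Duplicates are ignored
// and leave the metadata unchanged (an assumption of this sketch).
func (t *Tree) Add(data int) {
	link := &t.Root // points at the field where the new node would be stored
	for *link != nil {
		cur := *link
		switch {
		case data < cur.Data:
			link = &cur.Left
		case data > cur.Data:
			link = &cur.Right
		default:
			return // already present
		}
	}
	*link = &node{Data: data}
	t.Total += data
	t.Count++
}

// ToSlice returns the values in ascending order. Because Count is known,
// the slice is allocated once with the right capacity.
func (t *Tree) ToSlice() []int {
	out := make([]int, 0, t.Count)
	var walk func(n *node)
	walk = func(n *node) {
		if n == nil {
			return
		}
		walk(n.Left)
		out = append(out, n.Data)
		walk(n.Right)
	}
	walk(t.Root)
	return out
}

func main() {
	t := Tree{}
	for _, v := range []int{10, 100, 2, 3} {
		t.Add(v)
	}
	fmt.Println(t.Total, t.Count, t.ToSlice()) // prints: 115 4 [2 3 10 100]
}
```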

package main

import (
    "encoding/json"
    "errors"
    "fmt"
    "math/rand"
    "sync"
    "time"
)

type node struct {
    Left  *node
    Right *node
    Data  int
}

// Tree ...
type Tree struct {
    Root  *node
    Total int
}

// FindNode ...
func (t *Tree) FindNode(data int) bool {
    newNode := node{
        Data: data,
    }
    if t.Root != nil {
        if t.findNode(t.Root, newNode) != nil {
            return true
        }
    }
    return false
}

func (t *Tree) findNode(search *node, target node) *node {
    var returnNode *node
    if search == nil {
        return returnNode
    }
    if search.Data == target.Data {
        return search
    }
    returnNode = t.findNode(search.Left, target)
    if returnNode == nil {
        returnNode = t.findNode(search.Right, target)
    }
    return returnNode
}

// Add ...
func (t *Tree) Add(data int) {
    if data < 0 {
        panic(errors.New("Only submit positive integers"))
    }
    // Update the running total only after the input has been validated.
    t.Total += data
    nodeToAdd := node{
        Data: data,
    }
    if t.Root == nil {
        t.Root = new(node)
    }
    if t.Root.Data == 0 {
        t.Root = &nodeToAdd
        return
    }

    t.add(t.Root, nodeToAdd)
    return
}

func (t *Tree) add(oldnode *node, newNode node) {
    if newNode.Data < oldnode.Data {
        if oldnode.Left == nil {
            // t.Total += newNode.Data
            oldnode.Left = &newNode
        } else {
            // t.Total += newNode.Data
            t.add(oldnode.Left, newNode)
        }
    } else if newNode.Data > oldnode.Data {
        if oldnode.Right == nil {
            // t.Total += newNode.Data
            oldnode.Right = &newNode
        } else {
            // t.Total += newNode.Data
            t.add(oldnode.Right, newNode)
        }
    }
    return
}

// InOrderTraversal ...
func (t *Tree) InOrderTraversal() {
    if t.Root != nil {
        currentNode := t.Root
        if currentNode.Left == nil && currentNode.Right == nil {
            fmt.Println(currentNode.Data)
        } else {
            t.inOrderTraversal(currentNode)
        }
    }
    return
}

func (t *Tree) inOrderTraversal(n *node) {
    if n.Left != nil {
        t.inOrderTraversal(n.Left)
    }
    fmt.Println(n.Data)
    if n.Right != nil {
        t.inOrderTraversal(n.Right)
    }
    return
}

// Traversal ...
func (t *Tree) Traversal() {
    if t.Root != nil {
        currentNode := t.Root
        if currentNode.Left == nil && currentNode.Right == nil {
            fmt.Println(currentNode.Data)
        } else {
            t.traversal(currentNode)
        }
    }
    return
}

func (t *Tree) traversal(n *node) {
    fmt.Println(n.Data)
    if n.Left != nil {
        t.traversal(n.Left)
    }

    if n.Right != nil {
        t.traversal(n.Right)
    }
    return
}

// Sum ...
func (t *Tree) Sum() (total int) {
    var wg sync.WaitGroup
    c := make(chan int, 100)
    if t.Root != nil {
        currentNode := t.Root
        if currentNode.Left == nil && currentNode.Right == nil {
            return currentNode.Data // a single-node tree sums to its own value
        }
        wg.Add(1)
        t.sum(currentNode, c, &wg)
    }
    go func() {
        wg.Wait()
        close(c)
    }()
    for n := range c {
        total += n
    }
    return total
}

func (t *Tree) sum(n *node, counter chan int, wg *sync.WaitGroup) {
    defer wg.Done()

    if n.Left != nil {
        wg.Add(1)
        go t.sum(n.Left, counter, wg)
    }

    counter <- n.Data

    if n.Right != nil {
        wg.Add(1)
        go t.sum(n.Right, counter, wg)
    }

    return
}

// CountEdges returns the number of parent-child links in the tree.
// The channel-based version stopped at the first leaf it reached and
// raced between goroutines, so this counts with plain recursion instead.
func (t *Tree) CountEdges() (edges int) {
    return t.countEdges(t.Root)
}

func (t *Tree) countEdges(n *node) (edges int) {
    if n == nil {
        return 0
    }
    if n.Left != nil {
        edges += 1 + t.countEdges(n.Left)
    }
    if n.Right != nil {
        edges += 1 + t.countEdges(n.Right)
    }
    return edges
}

// GenerateRandomTree ...
func (t *Tree) GenerateRandomTree() {
    u := time.Now()
    source := rand.NewSource(u.Unix())
    r := rand.New(source)
    arr := r.Perm(1000)
    for _, a := range arr {
        t.Add(a)
    }
    return
}

// GetRootData ...
func (t *Tree) GetRootData() int {
    return t.Root.Data
}

// GetTreeTotal ...
func (t *Tree) GetTreeTotal() int {
    return t.Total
}

// TreeToArray ...
func (t *Tree) TreeToArray() []int {
    ch := make(chan int, 10)
    arr := []int{}
    if t.Root != nil {
        currentNode := t.Root
        if currentNode.Left == nil && currentNode.Right == nil {
            return []int{currentNode.Data}
        }
        t.traversalGetVals(currentNode, ch)
    }

    for {
        n := <-ch
        if n == -1 {
            close(ch)
            break
        }
        arr = append(arr, n)
    }
    return arr
}

func (t *Tree) traversalGetVals(n *node, ch chan int) {
    if n.Left != nil {
        ch <- n.Left.Data
        go t.traversalGetVals(n.Left, ch)
    }

    if n.Right != nil {
        ch <- n.Right.Data
        go t.traversalGetVals(n.Right, ch)
    }
    if n.Left == nil && n.Right == nil {
        ch <- -1
    }
    return
}

// ShiftRoot ...
func (t *Tree) ShiftRoot(newRoot int) {
    arr := t.TreeToArray()
    n := Tree{}
    n.Add(newRoot)
    for _, i := range arr {
        n.Add(i)
    }
    *t = n
}

// PrintTree ...
func (t *Tree) PrintTree() {
    b, err := json.MarshalIndent(t, "", " ")
    if err != nil {
        panic(err)
    }
    fmt.Println(string(b))
}

func main() {
    // t := Tree{}
    // t.GenerateRandomTree()
    // t.PrintTree()
    // fmt.Println("total:", t.Sum())

    t := Tree{}
    t.Add(10)
    t.Add(100)
    t.Add(2)
    t.Add(3)

    fmt.Println(t.Sum()) // should be 115
    fmt.Println(t.GetTreeTotal())

    // t := Tree{}
    // for i := 1; i <= 23; i++ {
    //  t.Add(i)
    // }
    // fmt.Println("total:", t.Sum())

}

Answer 3

The problem is mainly the fragmented memory allocation (via the recursive calls). This causes a lot of small allocations, and the garbage collector subsequently has a hefty job. You can circumvent this by pre-allocating a slice that holds all nodes and keeping a running index for assignment:

bar.go

package bar

type Tree struct {
    Left  *Tree
    Value int
    Right *Tree
    IsLeaf bool
}

func build(level int, curridx *int, src *[]Tree) *Tree {
    if level == 0 {
        (*src)[*curridx] = Tree{Left: nil, Value: 1, Right: nil, IsLeaf:true}
        *curridx++
        return &(*src)[*curridx-1]
    } else {
        (*src)[*curridx] = Tree{Left: build(level-1, curridx, src), Value: 1, Right: build(level-1, curridx, src)}
        *curridx++
        return &(*src)[*curridx-1]
    }
}

func sum(tree *Tree) int {
    if (tree.IsLeaf) {
        return tree.Value // Value is an int here, so no type assertion is needed
    } else {
        return sum(tree.Left) + sum(tree.Right)
    }
}

bar_test.go

package bar

import "testing"
import "math"

func TestMe(t *testing.T) {
    for x := 0; x < 10; x++ {
        levels := 23
        nrnodes := int(math.Pow(2.0, float64(levels+1))) //there are actually 24 levels
        mapping := make([]Tree, nrnodes, nrnodes)
        index := 0
        t.Error(sum(build(levels, &index, &mapping)))
    }
}

This brings the time down to about 0.5 seconds per iteration.
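To see the allocation difference rather than take it on faith, one option is to count heap allocations with runtime.ReadMemStats. This is only a sketch: buildHeap, buildArena, mallocs and sink are names invented here, and the tree is shallower than in the test above so it runs quickly. The arena variant is the same idea as the build function above, written with a plain slice parameter.

```go
package main

import (
	"fmt"
	"runtime"
)

type Tree struct {
	Left, Right *Tree
	Value       int
	IsLeaf      bool
}

// buildHeap allocates every node separately, as in the question.
func buildHeap(level int) *Tree {
	if level == 0 {
		return &Tree{Value: 1, IsLeaf: true}
	}
	return &Tree{Left: buildHeap(level - 1), Right: buildHeap(level - 1)}
}

// buildArena places nodes into a pre-allocated slice; next is the running index.
func buildArena(level int, next *int, arena []Tree) *Tree {
	var left, right *Tree
	if level > 0 {
		left = buildArena(level-1, next, arena)
		right = buildArena(level-1, next, arena)
	}
	n := &arena[*next]
	*next++
	*n = Tree{Left: left, Right: right, Value: 1, IsLeaf: level == 0}
	return n
}

func sum(t *Tree) int {
	if t.IsLeaf {
		return t.Value
	}
	return sum(t.Left) + sum(t.Right)
}

// mallocs reports how many heap objects were allocated while f ran.
func mallocs(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.Mallocs - before.Mallocs
}

// sink keeps the built trees reachable so the builds are not optimised away.
var sink *Tree

func main() {
	const levels = 16 // assumption: smaller than 23 to keep the run short
	separate := mallocs(func() { sink = buildHeap(levels) })
	arena := mallocs(func() {
		nodes := make([]Tree, 1<<(levels+1)-1) // 2^(levels+1) - 1 nodes
		next := 0
		sink = buildArena(levels, &next, nodes)
	})
	fmt.Println("separate allocations:", separate, "arena allocations:", arena)
	fmt.Println("sum:", sum(sink))
}
```

The first count should be roughly one allocation per node, while the arena version needs only a handful.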

Note the built-in profiling support for this:

Run go test -cpuprofile cpu.out, then go tool pprof cpu.out, and enter web at the pprof prompt.

3 answers

solution1 10 ACCEPTED 2017-06-20 05:53:39
solution2 3 2017-06-20 03:00:25
solution3 1 2017-06-20 18:52:34
