
Pass a const pointer to a large struct to a function or a Go channel

How can I pass a const pointer to a large struct to a function or a Go channel? The purpose of this question is to:

  1. Avoid accidental modification of the data behind the pointer by the function
  2. Avoid copying the struct object when passing it to a function/channel

This functionality is very common in C++, C#, and Java, but how can we achieve the same in Go?

============== Update 2 ===================

Thank you @zerkms, @mkopriva and @peterSO. Compiler optimization was causing the same result for both byValue() and byPointer(). I modified byValue() and byPointer() by adding data.array[0] = reverse(data.array[0]), just to keep the compiler from inlining the functions.

func byValue(data Data) int {
    data.array[0] = reverse(data.array[0])
    return len(data.array)
}

func byPointer(data *Data) int {
    data.array[0] = reverse(data.array[0])
    return len(data.array)
}

func reverse(s string) string {
    runes := []rune(s)
    for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
        runes[i], runes[j] = runes[j], runes[i]
    }
    return string(runes)
}

After that, running the benchmarks showed that passing by pointer was much more efficient than passing by value.

C:\Users\anikumar\Desktop\TestGo>go test -bench=.
goos: windows
goarch: amd64
BenchmarkByValue-4         18978             58228 ns/op               3 B/op          1 allocs/op
BenchmarkByPointer-4    40034295                33.1 ns/op             3 B/op          1 allocs/op
PASS
ok      _/C_/Users/anikumar/Desktop/TestGo      3.336s

C:\Users\anikumar\Desktop\TestGo>go test -gcflags -N -run=none -bench=.
goos: windows
goarch: amd64
BenchmarkByValue-4         20961             59380 ns/op               3 B/op          1 allocs/op
BenchmarkByPointer-4    31386213                36.5 ns/op             3 B/op          1 allocs/op
PASS
ok      _/C_/Users/anikumar/Desktop/TestGo      3.909s 

============== Update ===================

Based on feedback from @zerkms, I created a test to find the performance difference between passing by value and passing by pointer.

package main

import (
    "log"
    "time"
)

const size = 99999

// Data ...
type Data struct {
    array [size]string
}

func main() {
    // Preparing large data
    var data Data
    for i := 0; i < size; i++ {
        data.array[i] = "This is really long string"
    }

    // Starting test
    const max = 9999999999
    start := time.Now()
    for i := 0; i < max; i++ {
        byValue(data)
    }
    elapsed := time.Since(start)
    log.Printf("By Value took %s", elapsed)

    start = time.Now()
    for i := 0; i < max; i++ {
        byPointer(&data)
    }
    elapsed = time.Since(start)
    log.Printf("By Pointer took %s", elapsed)
}

func byValue(data Data) int {
    data.array[0] = reverse(data.array[0])
    return len(data.array)
}

func byPointer(data *Data) int {
    data.array[0] = reverse(data.array[0])
    return len(data.array)
}

func reverse(s string) string {
    runes := []rune(s)
    for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 {
        runes[i], runes[j] = runes[j], runes[i]
    }
    return string(runes)
}

After 10 iterations of the above program, I did not find any difference in execution time.

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:52:03 By Value took 5.2798936s
2020/02/16 15:52:09 By Pointer took 5.3466306s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:52:18 By Value took 5.3596692s
2020/02/16 15:52:23 By Pointer took 5.2724685s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:52:29 By Value took 5.2359938s
2020/02/16 15:52:34 By Pointer took 5.2838676s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:52:42 By Value took 5.8374936s
2020/02/16 15:52:49 By Pointer took 6.9524342s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:53:40 By Value took 5.4364867s
2020/02/16 15:53:46 By Pointer took 5.8712875s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:53:54 By Value took 5.5481591s
2020/02/16 15:54:00 By Pointer took 5.5600314s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:54:10 By Value took 5.4753771s
2020/02/16 15:54:16 By Pointer took 6.4368084s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:54:24 By Value took 5.4783356s
2020/02/16 15:54:30 By Pointer took 5.5312314s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:54:39 By Value took 5.4853542s
2020/02/16 15:54:45 By Pointer took 5.5541164s

C:\Users\anikumar\Desktop\TestGo>TestGo.exe
2020/02/16 15:54:57 By Value took 5.4633856s
2020/02/16 15:55:03 By Pointer took 5.4863226s

Looks like @zerkms is right. It is not because of language, it is because of modern hardware.

Meaningless microbenchmarks produce meaningless results.


In Go, all arguments are passed by value.
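A minimal sketch of what "passed by value" means for structs and pointers. The Config type and the three helper functions are hypothetical names used only for this illustration. Passing a struct copies the struct; passing a pointer copies only the pointer, so the callee can still change the struct it points to.

```go
package main

import "fmt"

// Config is a hypothetical struct used only for this illustration.
type Config struct {
	Name string
}

// renameCopy receives a copy of the struct; the caller's value is untouched.
func renameCopy(c Config) {
	c.Name = "changed"
}

// renamePointee receives a copy of the pointer. Both copies refer to the
// same struct, so the write is visible to the caller.
func renamePointee(c *Config) {
	c.Name = "changed"
}

// repoint reassigns its own copy of the pointer; the caller's pointer
// still refers to the original struct.
func repoint(c *Config) {
	c = &Config{Name: "other"}
	_ = c
}

func main() {
	cfg := Config{Name: "original"}

	renameCopy(cfg)
	fmt.Println(cfg.Name) // original

	repoint(&cfg)
	fmt.Println(cfg.Name) // original

	renamePointee(&cfg)
	fmt.Println(cfg.Name) // changed
}
```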


For your updated example (TestGo),

$ go version
go version devel +6917529cc6 Sat Feb 15 16:40:12 2020 +0000 linux/amd64
$ go run microbench.go
2020/02/16 13:12:56 By Value took 2.877045229s
2020/02/16 13:12:59 By Pointer took 2.875847918s
$

Go compilers are usually optimizing compilers. For example, the compiler inlines both functions:

./microbench.go:39:6: can inline byValue
./microbench.go:43:6: can inline byPointer
./microbench.go:26:10: inlining call to byValue
./microbench.go:33:12: inlining call to byPointer

There is no function call overhead. Therefore, there is no difference in execution time.
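The inlining report above is the kind of output that go build -gcflags=-m prints. One way to keep the call in the measurement is the //go:noinline compiler directive. The sketch below reuses the byValue and byPointer names from the question, with a smaller, arbitrarily chosen array size to keep the example light. It is an illustration of the directive, not the benchmark used in this answer.

```go
package main

import "fmt"

// size is smaller than the question's 99999; the value is an arbitrary
// choice for this sketch.
const size = 1024

// Data mirrors the struct from the question.
type Data struct {
	array [size]string
}

// byValue receives a copy of the whole struct on every call.
//
//go:noinline
func byValue(data Data) int {
	return len(data.array)
}

// byPointer receives only a copy of the pointer.
//
//go:noinline
func byPointer(data *Data) int {
	return len(data.array)
}

func main() {
	var data Data
	fmt.Println(byValue(data), byPointer(&data)) // 1024 1024
}
```

With the directive in place, the compiler should no longer report "can inline" for either function, so a benchmark over them includes the cost of passing the argument.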

microbench.go :

package main

import (
    "log"
    "time"
)

const size = 99999

// Data ...
type Data struct {
    array [size]string
}

func main() {
    // Preparing large data
    var data Data
    for i := 0; i < size; i++ {
        data.array[i] = "This is really long string"
    }

    // Starting test
    const max = 9999999999
    start := time.Now()
    for i := 0; i < max; i++ {
        byValue(data)
    }
    elapsed := time.Since(start)
    log.Printf("By Value took %s", elapsed)

    start = time.Now()
    for i := 0; i < max; i++ {
        byPointer(&data)
    }
    elapsed = time.Since(start)
    log.Printf("By Pointer took %s", elapsed)
}

func byValue(data Data) int {
    return len(data.array)
}

func byPointer(data *Data) int {
    return len(data.array)
}


ADDENDUM

Comment: @Anil8753 another thing to note is that the Go standard library has a testing package, which provides some useful functionality for benchmarking. For example, next to your main.go file add a main_test.go file (the file name is important), add these two benchmarks to it, and then from inside the folder run go test -run=none -bench=. to print how many operations were executed, how much time a single operation took, how much memory a single operation required, and how many allocations were required. – mkopriva


Go compilers are usually optimizing compilers. Modern hardware is usually heavily optimized.

For mkopriva's microbenchmark,

$ go test microbench.go mkopriva_test.go -bench=.
BenchmarkByValue-4     1000000000   0.289 ns/op   0 B/op   0 allocs/op
BenchmarkByPointer-4   1000000000   0.575 ns/op   0 B/op   0 allocs/op
$ 

However, for mkopriva's microbenchmark with a sink,

$ go test microbench.go sink_test.go -bench=.
BenchmarkByValue-4     1000000000   0.576 ns/op   0 B/op   0 allocs/op
BenchmarkByPointer-4   1000000000   0.592 ns/op   0 B/op   0 allocs/op
$ 

mkopriva_test.go :

package main

import (
    "testing"
)

func BenchmarkByValue(b *testing.B) {
    var data Data
    b.ReportAllocs()
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        byValue(data)
    }
}

func BenchmarkByPointer(b *testing.B) {
    var data Data
    b.ReportAllocs()
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        byPointer(&data)
    }
}

sink_test.go :

package main

import (
    "testing"
)

// benchInt is a package-level sink: storing each result in it keeps the
// compiler from discarding the calls as dead code.
var benchInt int

func BenchmarkByValue(b *testing.B) {
    var data Data
    b.ReportAllocs()
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        benchInt = byValue(data)
    }
}

func BenchmarkByPointer(b *testing.B) {
    var data Data
    b.ReportAllocs()
    b.ResetTimer()
    for n := 0; n < b.N; n++ {
        benchInt = byPointer(&data)
    }
}

I think this is a really good question, and I don't know why people have marked it down. (That is, the original question of using a "const pointer" to pass a large struct.)

The simple answer is that Go has no way to indicate that a function (or channel) taking a pointer is not going to modify the thing pointed to. Basically it is up to the creator of the function to document that the function will not modify the structure.
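One common convention that gets close to this, sketched here under assumptions, is to keep the struct's fields unexported and hand the function a read-only interface. The Reader interface, NewData, Set and summarize are hypothetical names for this illustration. This is a convention rather than a guarantee: code in the same package can still reach the fields, and a caller can type-assert the interface back to *Data.

```go
package main

import "fmt"

// Reader is a hypothetical read-only view of the data.
type Reader interface {
	Len() int
	At(i int) string
}

// Data keeps its field unexported, so code outside the package can only
// reach it through methods.
type Data struct {
	items []string
}

// NewData copies its input so the caller cannot alias the internal slice.
func NewData(items ...string) *Data {
	return &Data{items: append([]string(nil), items...)}
}

func (d *Data) Len() int            { return len(d.items) }
func (d *Data) At(i int) string     { return d.items[i] }
func (d *Data) Set(i int, s string) { d.items[i] = s }

// summarize sees only the Reader methods, so it has no Set to call.
// Only the interface value is copied at the call, not the data.
func summarize(r Reader) string {
	return fmt.Sprintf("%d items, first=%q", r.Len(), r.At(0))
}

func main() {
	d := NewData("a", "b", "c")
	fmt.Println(summarize(d)) // 3 items, first="a"
}
```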

@Anil8753 as you explicitly mention channels, I should explain something further. Typically, when using a channel you are passing data to another goroutine. If you pass a pointer to the struct, then the sender must be careful not to modify the struct after it has been sent (at least while the receiver could be reading it), and vice versa. Otherwise there is a data race.

For this reason I typically pass structs by value with channels. If you need to create something in the sender for the exclusive use of the receiver, then create a struct (on the heap), send a pointer to it on the channel, and never use it again in the sender (even assigning nil to the pointer to make this explicit).
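The ownership hand-off described above might look like the following sketch. The Job type and the produce and consume functions are hypothetical names for this illustration. The sender builds each struct, sends the pointer, and drops its own reference, so only the receiver touches the struct afterwards.

```go
package main

import "fmt"

// Job is a hypothetical large payload used only for this illustration.
type Job struct {
	ID      int
	Payload [1024]byte
}

// produce builds each Job, sends a pointer to it, and then drops its own
// reference so it cannot touch the Job again.
func produce(n int, out chan<- *Job) {
	for i := 0; i < n; i++ {
		job := &Job{ID: i}
		job.Payload[0] = byte(i)
		out <- job
		job = nil // the receiver owns the Job from here on
	}
	close(out)
}

// consume is the sole owner of each received Job and may read or modify
// it freely.
func consume(in <-chan *Job) int {
	sum := 0
	for job := range in {
		sum += job.ID
	}
	return sum
}

func main() {
	ch := make(chan *Job)
	go produce(4, ch)
	fmt.Println(consume(ch)) // 6, that is 0+1+2+3
}
```

Only the pointer travels through the channel, so the payload is not copied, and the data race is avoided by convention rather than by the type system.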

@zerkms makes a very good point that before you optimize you should understand what is happening and make measurements. However, in this case there is an obvious performance benefit to not copying memory around. Whether it happens when the struct is 1KB, 1MB, or 1GB, there will come a point where you want to pass by "reference" (i.e., a pointer to the struct) rather than by value (as long as you know the struct won't be modified, or don't care if it is).

In theory and in practice, copying by value becomes very inefficient when the struct is large enough or the function is called many times.
