My objective is to take a map[string]int containing potentially up to a million entries, chunk it into batches of up to 500, and POST each batch to an external service. I'm new to Go, so I'm tinkering in the Go Playground for now.
Any tips on how to improve the efficiency of my code are welcome!
Playground: https://play.golang.org/p/eJ4_Pd9X91c
The CLI output I'm seeing is:
original size 60
chunk bookends 0 20
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,
chunk bookends 20 40
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,
chunk bookends 40 60
0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,
The problem here is that while the chunk bookends are being calculated correctly, the x value starts at 0 each time. I expected it to start at the chunk's lower bookend, which would be 0, 20, 40, etc. Why does the range start at zero each time?
Source:
package main

import (
	"fmt"
	"math/rand"
	"strconv"
)

func main() {
	items := make(map[string]int)
	// Generate some fake data for our testing, in reality this could be 1m entries
	for i := 0; i < 60; i++ {
		// int as strings are intentional here
		items[strconv.FormatInt(int64(rand.Int()), 10)] = rand.Int()
	}
	// Create a slice of just the keys so we can easily chunk based on the numeric index
	i := 0
	keys := make([]string, len(items))
	for k := range items {
		keys[i] = k
		i++
	}
	fmt.Println("original size", len(keys))
	//batchContents := make(map[string]int)
	// Iterate numbers in the size batch we're looking for
	chunkSize := 20
	for chunkStart := 0; chunkStart < len(keys); chunkStart += chunkSize {
		chunkEnd := chunkStart + chunkSize
		if chunkEnd > len(items) {
			chunkEnd = len(items)
		}
		// Iterate over the keys
		fmt.Println("chunk bookends", chunkStart, chunkEnd)
		for x := range keys[chunkStart:chunkEnd] {
			fmt.Print(x, ",")
			// Build the batch contents with the contents needed from items
			// @todo is there a more efficient approach?
			//batchContents[keys[i]] = items[keys[i]]
		}
		fmt.Println()
		// @todo POST final batch contents
		//fmt.Println(batchContents)
	}
}
When you process a chunk:
for x := range keys[chunkStart:chunkEnd] {}
You are iterating over a slice, and with a single iteration variable, that variable is the slice index, not the element at that index. Since keys[chunkStart:chunkEnd] is a new slice value whose indices start at 0, x always starts at 0. (When you iterate over a map, the first iteration variable is the key because a map has no index, and the second is the value associated with that key.)
Instead you want this:
for _, key := range keys[chunkStart:chunkEnd] {}
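To make the index behaviour concrete, here is a minimal sketch with a small hard-coded slice. The helper absoluteIndexes is hypothetical and exists only for illustration; it shows that adding the chunk start back to the loop index recovers the position in the original slice, should you need it.

```go
package main

import "fmt"

// absoluteIndexes is a hypothetical helper for illustration only: for each
// element of keys[start:end] it returns that element's index in keys.
func absoluteIndexes(keys []string, start, end int) []int {
	out := make([]int, 0, end-start)
	for i := range keys[start:end] {
		// i counts from 0 within the sub-slice, so add the offset back.
		out = append(out, start+i)
	}
	return out
}

func main() {
	keys := []string{"a", "b", "c", "d", "e", "f"}
	// Two iteration variables: index within the sub-slice, then the element.
	for i, key := range keys[2:5] {
		fmt.Println(i, 2+i, key)
	}
	fmt.Println(absoluteIndexes(keys, 2, 5))
}
```

In most cases you only need the element itself, so the two-variable form with the index discarded (as above) is enough.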
Also note that it's redundant to first collect the keys in a slice and then process them. You can build the batches while iterating over the map once. Just keep a count of the iterations so you know when you reach the chunk size; the count may be implicit if you use a data structure that tracks it (e.g. the length of a slice of batch keys).
For example (try it on the Go Playground):
chunkSize := 20
batchKeys := make([]string, 0, chunkSize)
process := func() {
	fmt.Println("Batch keys:", batchKeys)
	batchKeys = batchKeys[:0]
}
for k := range items {
	batchKeys = append(batchKeys, k)
	if len(batchKeys) == chunkSize {
		process()
	}
}
// Process last, potentially incomplete batch
if len(batchKeys) > 0 {
	process()
}