Go's maps under the hood

After reading Dave Cheney's blog post about Go's maps, a few things are still unclear to me.

TL;DR:

  • Why are they unordered?
  • Where are the actual values stored in memory?

After digging into the runtime package, I found that the underlying map structure is the following:

// A header for a Go map.
type hmap struct {
    // Note: the format of the hmap is also encoded in cmd/compile/internal/gc/reflect.go.
    // Make sure this stays in sync with the compiler's definition.
    count     int // # live cells == size of map.  Must be first (used by len() builtin)
    flags     uint8
    B         uint8  // log_2 of # of buckets (can hold up to loadFactor * 2^B items)
    noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details
    hash0     uint32 // hash seed

    buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.
    oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing
    nevacuate  uintptr        // progress counter for evacuation (buckets less than this have been evacuated)

    extra *mapextra // optional fields
}

buckets is an array of buckets indexed by the low-order bits of the key's hash, where a bucket is:

// A bucket for a Go map.
type bmap struct {
    // tophash generally contains the top byte of the hash value
    // for each key in this bucket. If tophash[0] < minTopHash,
    // tophash[0] is a bucket evacuation state instead.
    tophash [bucketCnt]uint8
    // Followed by bucketCnt keys and then bucketCnt elems.
    // NOTE: packing all the keys together and then all the elems together makes the
    // code a bit more complicated than alternating key/elem/key/elem/... but it allows
    // us to eliminate padding which would be needed for, e.g., map[int64]int8.
    // Followed by an overflow pointer.
}

...well, it's just an array of uint8 where every item is the top byte of a key's hash. And key-value pairs are stored as key/key value/value (eight pairs per bucket). But where exactly? Considering that a map may contain values of (almost) any type, there should be some kind of pointer to the place in memory where the array of values is stored, but bmap doesn't have such info.

And since the keys' hashes are located in an ordered array inside the bucket, why is the order different every time I loop over the map?

  • Why are they unordered?

Because this gives the runtime greater freedom in implementing the map type. Although we know Go's (current) implementation is a hash map, the language specification allows any map implementation, such as a hash map or a tree map. Also, not having to remember the order allows the runtime to do its job more efficiently and with less memory.

Adrian's comment nicely summarizes that order is rarely needed, and that it would be a waste to always maintain it. When you do need order, you may use a data structure that provides the ordering. For examples, see Map in order range loop.
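One common way to get a deterministic order, sketched here as an illustration rather than taken from the linked answer, is to collect the keys into a slice, sort it, and range over the slice. The helper name sortedKeys and the sample data are made up for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order.
// (Hypothetical helper, not part of the standard library.)
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	m := map[string]int{"banana": 2, "apple": 1, "cherry": 3}
	// Ranging over the sorted slice gives the same order on every run.
	for _, k := range sortedKeys(m) {
		fmt.Println(k, m[k])
	}
}
```

The map itself stays unordered; only the slice of keys carries the order, so the sort cost is paid only where order is actually needed.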

And since the keys' hashes are located in an ordered array inside the bucket, why is the order different every time I loop over the map?

The Go authors intentionally randomized the map's iteration order (so we mortals don't come to depend on a fixed order). For more, see In Golang, why are iterations over maps random?

Also see related: Why can't Go iterate maps in insertion order?
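The randomization is easy to observe. The following sketch ranges over the same small map many times and counts how many different key orders show up; the helper name distinctOrders and the sample map are invented for illustration. The exact count varies from run to run, so no particular output should be expected.

```go
package main

import (
	"fmt"
	"strings"
)

// distinctOrders ranges over m the given number of times and returns
// the set of distinct key orders that were observed.
// (Hypothetical helper for this demonstration.)
func distinctOrders(m map[string]int, rounds int) map[string]bool {
	seen := make(map[string]bool)
	for i := 0; i < rounds; i++ {
		var keys []string
		for k := range m {
			keys = append(keys, k)
		}
		seen[strings.Join(keys, ",")] = true
	}
	return seen
}

func main() {
	m := map[string]int{"a": 1, "b": 2, "c": 3, "d": 4}
	orders := distinctOrders(m, 100)
	// Usually more than one order is seen, but the count is not fixed.
	fmt.Println("distinct orders seen:", len(orders))
}
```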

  • Where are the actual values stored in memory?

The "where" is specified by hmap.buckets. This is a pointer value; it points to an array in memory, an array holding the buckets.

buckets    unsafe.Pointer // array of 2^B Buckets. may be nil if count==0.

So hmap.buckets points to a contiguous memory segment holding buckets. A bucket is "modeled" by bmap, but this is not its full memory layout. A bucket starts with an array holding the top hash bytes of the keys in the bucket (tophash [bucketCnt]uint8), and this array is followed by the bucketCnt keys of the bucket, which are then followed by the bucketCnt values of the bucket. Lastly there is an overflow pointer.
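To make the two uses of the hash concrete, here is a simplified sketch of how a 64-bit hash could be split into a bucket index (the low-order B bits) and a tophash byte (the most significant byte). The function names are made up, a 64-bit hash is assumed, and the special handling of small tophash values mentioned in the bmap comment (minTopHash) is deliberately left out.

```go
package main

import "fmt"

// bucketIndex returns the bucket a hash selects when there are 2^B
// buckets: the low-order B bits of the hash.
// (Illustrative helper, not the runtime's actual function.)
func bucketIndex(hash uint64, B uint8) uint64 {
	return hash & (uint64(1)<<B - 1)
}

// topHash returns the most significant byte of a 64-bit hash, the value
// that would be stored in the bucket's tophash array.
// The runtime's adjustment for values below minTopHash is omitted here.
func topHash(hash uint64) uint8 {
	return uint8(hash >> 56)
}

func main() {
	const hash uint64 = 0xAB00000000000015
	fmt.Println(bucketIndex(hash, 3)) // low 3 bits of 0x15 (0b10101) are 0b101: prints 5
	fmt.Println(topHash(hash))        // top byte is 0xAB: prints 171
}
```

Comparing the one-byte tophash entries first lets a lookup skip most non-matching slots in a bucket without comparing full keys.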

Think of the bucket as this conceptual type, which "visualizes" where keys and values are located in memory:

type conceptualBucket struct {
    tophash     [bucketCnt]uint8
    keys        [bucketCnt]keyType
    values      [bucketCnt]valueType
    overflowPtr uintptr
}

Note: bucketCnt is a compile-time constant equal to 8; it is the maximum number of key/elem pairs a bucket can hold.

Of course this "picture" is inaccurate, but it gives an idea of where and how keys and values are stored.
