
Golang map with multiple keys per value

Consider the following XML data structure:

<MediaItems>
    <item url="media/somefolder/pic1.jpg" id="1">
        <groups>
            <group>1</group>
            <group>2</group>
        </groups>
    </item>
    <item url="media/somefolder/pic2.jpg" id="2">
        <groups>
            <group>3</group>
            <group>7</group>
        </groups>
    </item>
</MediaItems>

My XML file could grow to 10,000 or even 100,000+ media item elements, so I need to access individual items in the parsed Go structure (a map, or whatever fits better) the way we do with map[key]type. However, I need to be able to use either the url or the id as the key, and I can't figure out how to create a map with 2 keys pointing to the same value.

I need to parse the XML above in Go and store the result in a type like:

map[string, string]MediaItem

The keys should be the url and the id, so I'd be able to get the item with id 1 with either myMap["1"] or myMap["media/somefolder/pic1.jpg"]. Both should return the corresponding MediaItem.

I can't wrap my head around how to implement this. Or maybe there is a better way to achieve the same thing?

Staying with the map type, there are 2 (or 3) possible solutions:

With 2 maps

Easiest would be to build 2 maps, 1 where the keys are the urls, and 1 where the keys are the ids:

// Initialize with make: writing to a nil map panics.
var byUrlMap = make(map[string]*MediaItem)
var byIdMap = make(map[string]*MediaItem)

Note that you should store pointers rather than struct values, so that both maps refer to the same MediaItem instead of each holding its own copy.

If you need a MediaItem by id:

mi := byIdMap[id]

Similarly by url:

mi2 := byUrlMap[url]
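
To tie this back to the XML in the question, here is a minimal sketch of parsing the file with encoding/xml and filling both maps. The MediaItem fields, the Index type and the buildIndex function are assumptions made for illustration; the question does not show how MediaItem is defined.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// MediaItem mirrors one <item> element (field layout is an assumption).
type MediaItem struct {
	URL    string `xml:"url,attr"`
	ID     string `xml:"id,attr"`
	Groups []int  `xml:"groups>group"`
}

// mediaItems mirrors the <MediaItems> root element.
type mediaItems struct {
	Items []MediaItem `xml:"item"`
}

// Index holds two maps whose values point at the same MediaItem.
type Index struct {
	byURL map[string]*MediaItem
	byID  map[string]*MediaItem
}

// buildIndex parses the XML and registers every item under both keys.
func buildIndex(data []byte) (*Index, error) {
	var doc mediaItems
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	idx := &Index{
		byURL: make(map[string]*MediaItem, len(doc.Items)),
		byID:  make(map[string]*MediaItem, len(doc.Items)),
	}
	for i := range doc.Items {
		mi := &doc.Items[i] // pointer into the slice, shared by both maps
		idx.byURL[mi.URL] = mi
		idx.byID[mi.ID] = mi
	}
	return idx, nil
}

const sampleXML = `<MediaItems>
    <item url="media/somefolder/pic1.jpg" id="1">
        <groups><group>1</group><group>2</group></groups>
    </item>
    <item url="media/somefolder/pic2.jpg" id="2">
        <groups><group>3</group><group>7</group></groups>
    </item>
</MediaItems>`

func main() {
	idx, err := buildIndex([]byte(sampleXML))
	if err != nil {
		panic(err)
	}
	fmt.Println(idx.byID["1"].URL)                             // lookup by id
	fmt.Println(idx.byURL["media/somefolder/pic2.jpg"].Groups) // lookup by url
	// Both maps return the very same pointer for the same item.
	fmt.Println(idx.byID["1"] == idx.byURL["media/somefolder/pic1.jpg"])
}
```

Each extra map costs one more pointer per item, which should stay modest even at 100000+ items, though that is worth measuring on real data.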

With key prefixes

Another option is to prefix the actual key values. This is somewhat less efficient, but you end up with only one map.

For example, you could prefix URL keys with "url:" and ids with "id:", storing the same pointer value under both keys:

var miMap = make(map[string]*MediaItem)

mi := &MediaItem{}
miMap["url:http://something"] = mi
miMap["id:1"] = mi

And getting an element:

mi2 := miMap["id:" + id]   // By id
mi3 := miMap["url:" + url] // By url

Using keys "as-is"

This is similar to "With key prefixes": if you have a guarantee that the URLs and ids will never collide (meaning no id is ever equal to another item's url, and vice versa), you can simply use both keys without prefixes and store the same value (pointer) under each.

The purpose of the key prefixes was to make sure a final url key can never equal a final id key (achieved by using different prefixes for the 2 types of keys). But if this condition holds naturally (e.g. the string value of a numeric id will never be a valid url), we don't really need the prefixes:

var miMap = make(map[string]*MediaItem)

mi := &MediaItem{}
miMap["http://something"] = mi
miMap["1"] = mi

And getting an element:

mi2 := miMap[id]  // By id
mi3 := miMap[url] // By url

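Another option is to use a struct with two fields as the key. Note that a lookup then needs both the url and the id, so this only fits when both are known at lookup time: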

type key struct {
    url string
    id  int
}

m := make(map[key]MediaItem)
m[key{url: "http://...", id: 2}] = MediaItem{}
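
A small sketch of how lookups behave with such a struct key may help when weighing this option. Go compares struct keys field by field, so an entry is only found when every field matches. The MediaItem fields and the newComposite helper are placeholders invented for this illustration.

```go
package main

import "fmt"

// MediaItem is a placeholder; the snippet above does not show its fields.
type MediaItem struct {
	Name string
}

// key is the composite key from the snippet above.
type key struct {
	url string
	id  int
}

// newComposite builds a map with a single entry, for illustration only.
func newComposite() map[key]MediaItem {
	m := make(map[key]MediaItem)
	m[key{url: "media/somefolder/pic2.jpg", id: 2}] = MediaItem{Name: "pic2"}
	return m
}

func main() {
	m := newComposite()

	// Found: both fields match the stored key.
	_, okBoth := m[key{url: "media/somefolder/pic2.jpg", id: 2}]

	// Not found: the url field is empty, so this is a different key.
	_, okIDOnly := m[key{id: 2}]

	fmt.Println(okBoth, okIDOnly) // true false
}
```

So a composite key answers "find the item with this url and this id", not "find the item with this url or this id" as the question asks.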

All these solutions are fine, except that they get awkward when you need to remove elements from the map. For example, suppose we keep a collection of IP addresses accessible by both IP address and a node ID of some kind. You determine that an IP address is no longer active and need to remove it. If you remove the item from the IP map only, you leave a stale pointer behind in the node ID map. And since the IP address may be all you have at the time of removal, there's no obvious way to look up which ID entry corresponds to that IP address so you can remove it too.

Anyone found a clean solution to that problem?
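
One possible approach, sketched here under the assumption that each stored value records both of its own keys: look the value up by the key you have, then read the other key from it and delete from both maps. The Node and Registry types and the RemoveByIP method are hypothetical names for this sketch, which is not safe for concurrent use and assumes keys are unique.

```go
package main

import "fmt"

// Node stores both of its keys, so either one is enough to find the other.
type Node struct {
	IP     string
	NodeID string
}

// Registry keeps the two indexes together so they only change in step.
type Registry struct {
	byIP map[string]*Node
	byID map[string]*Node
}

// NewRegistry returns an empty registry with both maps initialized.
func NewRegistry() *Registry {
	return &Registry{
		byIP: make(map[string]*Node),
		byID: make(map[string]*Node),
	}
}

// Add registers n under both its IP and its node ID.
func (r *Registry) Add(n *Node) {
	r.byIP[n.IP] = n
	r.byID[n.NodeID] = n
}

// RemoveByIP finds the node by IP, then uses the NodeID stored in it to
// delete the matching entry from the second map. It reports whether an
// entry was removed.
func (r *Registry) RemoveByIP(ip string) bool {
	n, ok := r.byIP[ip]
	if !ok {
		return false
	}
	delete(r.byIP, n.IP)
	delete(r.byID, n.NodeID)
	return true
}

func main() {
	r := NewRegistry()
	r.Add(&Node{IP: "10.0.0.1", NodeID: "node-a"})
	r.Add(&Node{IP: "10.0.0.2", NodeID: "node-b"})

	fmt.Println(r.RemoveByIP("10.0.0.1")) // true
	_, stillThere := r.byID["node-a"]
	fmt.Println(stillThere) // false
}
```

Keeping both maps behind one type means no caller can update one index and forget the other.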

Possibly off topic, but there is another closely related problem common to almost every language: their built-in maps are not sorted and cannot contain duplicate keys.

In my IP address example above, you may need to delete items that are older than a certain number of minutes. There's no way to jump straight to the "oldest" elements; the entire map must be scanned to find them. Whoever provided these simple maps, well, why couldn't they have provided a sorted map too? It's needed all over the place, and for small collections a BTree can be faster than a hashmap (no upfront hashing and no collision handling necessary). The C++ multimap and various other collections are what I'm talking about, though granted, they are library code rather than part of the language itself.
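
For the specific "drop the oldest entries" case, one workaround is to pair the map with a list kept in age order, so expiry only touches the front of the list. This is only a sketch using the standard container/list package, not a general sorted map. The AgedSet type and its methods are hypothetical names, and the sketch assumes entries are added with non-decreasing timestamps.

```go
package main

import (
	"container/list"
	"fmt"
	"time"
)

// entry is what each list element carries.
type entry struct {
	ip   string
	seen time.Time
}

// AgedSet indexes entries by IP and also keeps them ordered by age.
type AgedSet struct {
	byIP  map[string]*list.Element
	order *list.List // front is the oldest entry
}

// NewAgedSet returns an empty set.
func NewAgedSet() *AgedSet {
	return &AgedSet{byIP: make(map[string]*list.Element), order: list.New()}
}

// Add records ip as seen at t. Callers are assumed to pass non-decreasing
// times, which is what keeps the list sorted by age.
func (s *AgedSet) Add(ip string, t time.Time) {
	if el, ok := s.byIP[ip]; ok {
		s.order.Remove(el) // re-adding refreshes the entry
	}
	s.byIP[ip] = s.order.PushBack(entry{ip: ip, seen: t})
}

// Expire removes every entry seen before cutoff and returns how many
// were dropped. It stops at the first entry that is new enough.
func (s *AgedSet) Expire(cutoff time.Time) int {
	n := 0
	for el := s.order.Front(); el != nil; el = s.order.Front() {
		e := el.Value.(entry)
		if !e.seen.Before(cutoff) {
			break
		}
		s.order.Remove(el)
		delete(s.byIP, e.ip)
		n++
	}
	return n
}

// demo adds three entries and expires those older than five minutes.
func demo() (expired, remaining int) {
	base := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	s := NewAgedSet()
	s.Add("10.0.0.1", base)
	s.Add("10.0.0.2", base.Add(1*time.Minute))
	s.Add("10.0.0.3", base.Add(10*time.Minute))
	expired = s.Expire(base.Add(5 * time.Minute))
	return expired, len(s.byIP)
}

func main() {
	fmt.Println(demo()) // 2 1
}
```

If timestamps can arrive out of order, this list no longer stays sorted, and a heap (container/heap) or a third-party ordered map would be a better fit.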
