
Why does std::map operator[] create an object if the key doesn't exist?

I'm pretty sure I already saw this question somewhere (comp.lang.c++? Google doesn't seem to find it there either), but a quick search here didn't turn it up, so here it is:

Why does std::map's operator[] create an object if the key doesn't exist? To me this seems counter-intuitive compared with most other operator[] implementations (like std::vector's), where you must be sure the index exists before using it. I'm wondering what the rationale is for this behavior in std::map. Like I said, wouldn't it be more intuitive for it to act like an index into a vector and crash (well, undefined behavior, I guess) when accessed with an invalid key?

Refining my question after seeing the answers:

OK, so far I've got a lot of answers saying basically that it's cheap, so why not, or something similar. I totally agree with that, but why not use a dedicated function for it (I think one of the comments said that in Java there is no operator[] and the function is called put)? My point is: why doesn't map's operator[] work like a vector's? If I use operator[] with an out-of-range index on a vector, I wouldn't want it to insert an element, even if that were cheap, because it probably means there is an error in my code. Why isn't it the same with map? For me, using operator[] on a map would mean: I know this key already exists (for whatever reason: I just inserted it, I have redundancy somewhere, whatever). I think it would be more intuitive that way.

That said, what are the advantages of the current behavior for operator[] (and only for that; I agree that a function with the current behavior should exist, just not as operator[])? Maybe it gives clearer code that way? I don't know.

Another answer was that it already existed that way, so why not keep it. But then, when they (the ones before the STL) chose to implement it that way, they presumably found it provided some advantage? So my question is basically: why implement it that way, at the cost of some inconsistency with other operator[] implementations? What benefit does it give?

Thanks

Because operator[] returns a reference to the value itself and so the only way to indicate a problem would be to throw an exception (and in general, the STL rarely throws exceptions).

If you don't like this behavior, you can use map::find instead. It returns an iterator instead of the value. This allows it to return a special iterator when the key is not found (it returns map::end()), but it also requires you to dereference the iterator to get at the value.
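As a minimal sketch of that non-inserting lookup (the helper name lookup and the demo function are made up for illustration, they are not part of std::map):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: returns a pointer to the mapped value, or nullptr
// when the key is absent. It never inserts anything.
const int* lookup(const std::map<std::string, int>& m, const std::string& key)
{
    std::map<std::string, int>::const_iterator it = m.find(key);
    return it == m.end() ? nullptr : &it->second;
}

void demo_find()
{
    std::map<std::string, int> m;
    m["one"] = 1;

    assert(lookup(m, "one") != nullptr && *lookup(m, "one") == 1);
    assert(lookup(m, "two") == nullptr); // missing key is reported, not created
    assert(m.size() == 1);               // the map was not modified
}
```

Because find is a const member, this also works on a const map, where operator[] is not available at all.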

The standard says (23.3.1.2/1) that operator[] returns (*((insert(make_pair(x, T()))).first)).second. That's the reason. It returns a reference, T&, and there is no way to return an invalid reference. And it returns a reference because that is very convenient, I guess, isn't it?
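To make that expression concrete, here is a small sketch that spells it out as a free function (subscript_like is a hypothetical name, and the key and value types are fixed to std::string and int just to keep it short). Newer editions of the standard word the specification differently, but as far as I know the observable effect is the same:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// The expression quoted above, written out. For a missing key it inserts a
// value-initialized int; for an existing key insert() changes nothing.
int& subscript_like(std::map<std::string, int>& m, const std::string& key)
{
    return (*((m.insert(std::make_pair(key, int()))).first)).second;
}

void demo_subscript()
{
    std::map<std::string, int> m;

    int& v = subscript_like(m, "x");     // key absent: inserted with int() == 0
    assert(v == 0 && m.size() == 1);

    v = 7;
    assert(m["x"] == 7);                 // same element the real operator[] sees
    assert(subscript_like(m, "x") == 7); // key present: existing value is kept
    assert(m.size() == 1);
}
```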

To answer your real question: there's no convincing explanation as to why it was done that way. "Just because".

Since std::map is an associative container, there's no clear pre-defined range of keys that must exist (or not exist) in the map (as opposed to the completely different situation with std::vector). That means that with std::map you need both non-inserting and inserting lookup functionality. One could overload [] in a non-inserting way and provide a function for insertion. Or one could do it the other way around: overload [] as an inserting operator and provide a function for non-inserting search. So, someone at some point decided to follow the latter approach. That's all there is to it.

If they did it the other way around, maybe today someone would be asking here the reverse version of your question.

It is for assignment purposes:


#include <map>
#include <string>

void test()
{
   std::map<std::string, int> myMap;
   myMap["hello"] = 5;   // creates the "hello" entry, then assigns 5 to it
}

I think it's mostly because in the case of map (unlike vector, for example) it's fairly cheap and easy to do -- you only have to create a single element. In the case of vector they could extend the vector to make a new subscript valid -- but if your new subscript is well beyond what's already there, adding all the elements up to that point may be fairly expensive. When you extend a vector you also normally specify the values of the new elements to be added (though often with a default value). In this case, there would be no way to specify the values of the elements in the space between the existing elements and the new one.

There's also a fundamental difference in how a map is typically used. With a vector, there's usually a clear delineation between things that add to a vector, and things that work with what's already in the vector. With a map, that's much less true -- it's much more common to see code that manipulates the item that's there if there is one, or adds a new item if it's not already there. The design of operator[] for each reflects that.

It allows insertion of new elements with operator[] , like this:

std::map<std::string, int> m;
m["five"] = 5;

The 5 is assigned to the value returned by m["five"], which is a reference to a newly created element. If operator[] didn't insert new elements, this couldn't work that way.

map.insert(std::make_pair(key, item)); makes sure key is in the map but does not overwrite an existing value.

map[key] = item; makes sure key is in the map and overwrites any existing value with item.

Both of these operations are important enough to warrant a single line of code. The designers probably picked which operation was more intuitive for operator[] and created a function call for the other.
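A short sketch of the difference between the two operations (the function name is made up; the asserts document what I would expect from any conforming implementation):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

void demo_insert_vs_subscript()
{
    std::map<std::string, int> m;
    m["k"] = 1;

    // insert() leaves an existing value alone and reports that via .second.
    std::pair<std::map<std::string, int>::iterator, bool> r =
        m.insert(std::make_pair(std::string("k"), 2));
    assert(!r.second);       // nothing was inserted
    assert(m["k"] == 1);     // old value survives

    // operator[] followed by assignment overwrites the existing value.
    m["k"] = 3;
    assert(m["k"] == 3);
    assert(m.size() == 1);
}
```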

The difference here is that map stores the "index" too, i.e. the value stored in the map (in its underlying RB tree) is a std::pair of key and value, not just the "indexed" value. There's always map::find(), which will tell you whether a pair with a given key exists.

The answer is because they wanted an implementation that is both convenient and fast.

The underlying implementation of a vector is an array. So if there are 10 entries in the array and you want entry 5, the T& vector::operator[](5) function just returns *(headptr+5). If you ask for entry 5400 it returns *(headptr+5400), without checking that such an entry exists.

The underlying implementation of a map is usually a tree. Each node is allocated dynamically, unlike the vector which the standard requires to be contiguous. So nodeptr+5 doesn't mean anything and map["some string"] doesn't mean rootptr+offset("some string").

Like find with maps, vector has at() if you want bounds checking. In the case of vectors, bounds checking was considered an unnecessary cost for those who did not want it. In the case of maps, the only alternative to returning a reference is to throw an exception, and that was also considered an unnecessary cost for those who did not want it.
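For the vector side, the checked accessor is the at() member, which throws std::out_of_range instead of reading past the end. A small sketch (at_throws is a hypothetical helper written just for this example):

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: true if v.at(i) throws std::out_of_range.
bool at_throws(const std::vector<int>& v, std::size_t i)
{
    try {
        (void)v.at(i);   // bounds-checked access
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

void demo_vector_at()
{
    std::vector<int> v(10, 0);
    assert(!at_throws(v, 5));    // valid index
    assert(at_throws(v, 5400));  // v[5400] would be undefined behavior instead
}
```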

Consider such an input - 3 blocks, each block 2 lines, first line is the number of elements in the second one:

5
13 20 22 43 146
4
13 22 43 146
5
13 43 67 89 146

Problem: calculate the number of integers that are present in the second lines of all three blocks. (For this sample input the output should be 3, since 13, 43 and 146 are present in the second lines of all three blocks.)

See how nice this code is:

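#include <iostream>
#include <map>
using namespace std;

int main()
{
    int n, curr;
    // maps each value to the number of blocks it appeared in
    // (assumes a value is not repeated within one block)
    map<unsigned, unsigned char> myMap;
    for (int i = 0; i < 3; ++i)
    {
        cin >> n;
        for (int j = 0; j < n; ++j)
        {
            cin >> curr;
            myMap[curr]++;   // inserts 0 for a new key, then increments
        }
    }

    // a value counted 3 times is present in all three blocks
    unsigned count = 0;
    for (auto it = myMap.begin(); it != myMap.end(); ++it)
    {
        if (it->second == 3)
            ++count;
    }

    cout << count << endl;
    return 0;
}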

According to the standard, operator[] returns a reference to (*((insert(make_pair(key, T()))).first)).second. That is why I could write:

myMap[curr]++;

and it inserted an element with key curr, with the value initialized to zero, if the key was not present in the map. It then incremented the value, whether or not the element was already in the map.

See how simple? It is nice, isn't it? This is a good example that it is really convenient.

I know this is an old question, but no one seems to have answered it well IMO. So far I haven't seen any mention of this:

The possibility of undefined behavior is to be avoided! If there is any reasonable behavior besides UB, then I imagine we should go with that.

std::vector/array exhibits undefined behavior with a bad operator[] index because there is really no reasonable option: indexing is one of the fastest, most fundamental things you can do in C/C++, and it would be wrong to try to check anything. Checking is what at() is for.

std::*associative_container* has already done the work of finding where an indexed element would go, so it makes sense to create one there and return it. This is very useful behavior, and alternatives to operator[] are much less clean looking, but even if creating and inserting a new item is not what you wanted, or is not useful to you, it is still a much better result than undefined behavior.

I think operator[] is the much preferred syntax for using an associative container, for readability. To me it is very intuitive, and it matches exactly the concept of operator[] for arrays: return a reference to the item at that position, to use or to assign to.

If my intuition for "what if there is nothing there" was only "undefined behavior", then I would be absolutely no worse off, since I would be doing all I could to avoid that, full stop.

Then one day I find out that I can insert an item with operator[] ... life is just better.

If you want to read an element with some key from an std::map,
but you are unsure whether it exists,
and in case it doesn't, you don't want to insert it by accident,
but rather want to get an exception thrown,
but you also don't want to manually check map.find(key) != map.end() every time you read an element,

just use map::at(key) (C++11)

https://www.cplusplus.com/reference/map/map/at/
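A minimal sketch of what that looks like in practice (the demo function is invented for illustration; map::at and std::out_of_range are the real names):

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

void demo_map_at()
{
    std::map<std::string, int> m;
    m["one"] = 1;

    assert(m.at("one") == 1);   // existing key: same value operator[] gives

    bool threw = false;
    try {
        (void)m.at("two");      // missing key: throws, does not insert
    } catch (const std::out_of_range&) {
        threw = true;
    }
    assert(threw);
    assert(m.size() == 1);      // "two" was never created
}
```

Unlike operator[], at() also has a const overload, so it can be used to read from a const map.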

It is not possible to avoid the creation of an object, because operator[] doesn't know whether it is being used for writing or for reading:

myMap["apple"] = "green";

or

char const * cColor = myMap["apple"];

I propose that the map container should add a function like

if( ! myMap.exist( "apple")) throw ...

it is much simpler and better to read than

if( myMap.find( "apple") == myMap.end()) throw ...
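Until something like that exists as a member, a tiny free function gives much the same readability. This is only a sketch: has_key is a hypothetical name, not a standard function. For what it's worth, the existing count() member already answers the same question for std::map, and C++20 added a contains() member that does exactly this.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the proposed exist() member; it only looks the
// key up and never inserts.
template <typename Map>
bool has_key(const Map& m, const typename Map::key_type& key)
{
    return m.find(key) != m.end();
}

void demo_has_key()
{
    std::map<std::string, std::string> m;
    m["apple"] = "green";

    assert(has_key(m, "apple"));
    assert(!has_key(m, "pear"));
    assert(m.count("pear") == 0);  // built-in equivalent: 0 or 1 for std::map
    assert(m.size() == 1);         // neither check created an entry
}
```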

