简体   繁体   中英

Best way to store constant data in C++

I have an array of constant data like following:

enum Language {GERMAN=LANG_DE, ENGLISH=LANG_EN, ...};
struct LanguageName {
    ELanguage language;
    const char *name;
};

const Language[] languages = {
    GERMAN, "German",
    ENGLISH, "English",
    .
    .
    .
};

When I have a function which accesses the array and find the entry based on the Language enum parameter. Should I write a loop to find the specific entry in the array or are there better ways to do this.

I know I could add the LanguageName-objects to an std::map but wouldn't this be overkill for such a simple problem? I do not have an object to store the std::map so the map would be constructed for every call of the function.

What way would you recommend?

Is it better to encapsulate this compile time constant array in a class which handles the lookup?

If the enum values are contiguous starting from 0 , use an array with the enum as index.

If not, this is what I usually do:

const char* find_language(Language lang)
{
  typedef std::map<Language,const char*> lang_map_type;
  typedef lang_map_type::value_type lang_map_entry_type;

  static const lang_map_entry_type lang_map_entries[] = { /*...*/  }
  static const lang_map_type lang_map( lang_map_entries
                                     , lang_map_entries + sizeof(lang_map_entries)
                                                        / sizeof(lang_map_entries[0]) );
  lang_map_type::const_iterator it = lang_map.find(lang);
  if( it == lang_map.end() ) return NULL;
  return it->second;
}

If you consider a map for constants, always also consider using a vector.

Function-local statics are a nice way to get rid of a good part of the dependency problems of globals, but are dangerous in a multi-threaded environment. If you're worried about that, you might rather want to use globals:

typedef std::map<Language,const char*> lang_map_type;
typedef lang_map_type::value_type lang_map_entry_type;

const lang_map_entry_type lang_map_entries[] = { /*...*/  }
const lang_map_type lang_map( lang_map_entries
                            , lang_map_entries + sizeof(lang_map_entries)
                                               / sizeof(lang_map_entries[0]) );

const char* find_language(Language lang)
{
  lang_map_type::const_iterator it = lang_map.find(lang);
  if( it == lang_map.end() ) return NULL;
  return it->second;
}

There are three basic approaches that I'd choose from. One is the switch statement, and it is a very good option under certain conditions. Remember - the compiler is probably going to compile that into an efficient table-lookup for you, though it will be looking up pointers to the case code blocks rather than data values.

Options two and three involve static arrays of the type you are using. Option two is a simple linear search - which you are (I think) already doing - very appropriate if the number of items is small.

Option three is a binary search. Static arrays can be used with standard library algorithms - just use the first and first+count pointers in the same way that you'd use begin and end iterators. You will need to ensure the data is sorted (using std::sort or std::stable_sort), and use std::lower_bound to do the binary search.

The complication in this case is that you'll need a comparison function object which acts like operator< with a stored or referenced value, but which only looks at the key field of your struct. The following is a rough template...

class cMyComparison
{
  private:
    const fieldtype& m_Value;  //  Note - only storing a reference

  public:
    cMyComparison (const fieldtype& p_Value) : m_Value (p_Value) {}

    bool operator() (const structtype& p_Struct) const
    {
      return (p_Struct.field < m_Value);
        //  Warning : I have a habit of getting this comparison backwards,
        //            and I haven't double-checked this
    }
};

This kind of thing should get simpler in the next C++ standard revision, when IIRC we'll get anonymous functions (lambdas) and closures.

If you can't put the sort in your apps initialisation, you might need an already-sorted boolean static variable to ensure you only sort once.

Note - this is for information only - in your case, I think you should either stick with linear search or use a switch statement. The binary search is probably only a good idea when...

  1. There are a lot of data items to search
  2. Searches are done very frequently (many times per second)
  3. The key enumerate values are sparse (lots of big gaps) - otherwise, switch is better.

If the coding effort were trivial, it wouldn't be a big deal, but C++ currently makes this a bit harder than it should be.

One minor note - it may be a good idea to define an enumerate for the size of your array, and to ensure that your static array declaration uses that enumerate. That way, your compiler should complain if you modify the table (add/remove items) and forget to update the size enum, so your searches should never miss items or go out of bounds.

I think you have two questions here:

  • What is the best way to store a constant global variable (with possible Multi-Threaded access) ?
  • How to store your data (which container use) ?

The solution described by sbi is elegant, but you should be aware of 2 potential problems:

  1. In case of Multi-Threaded access, the initialization could be skrewed.
  2. You will potentially attempt to access this variable after its destruction.

Both issues on the lifetime of static objects are being covered in another thread .

Let's begin with the constant global variable storage issue.

The solution proposed by sbi is therefore adequate if you are not concerned by 1. or 2., on any other case I would recommend the use of a Singleton, such as the ones provided by Loki . Read the associated documentation to understand the various policies on lifetime, it is very valuable.

I think that the use of an array + a map seems wasteful and it hurts my eyes to read this. I personally prefer a slightly more elegant (imho) solution.

const char* find_language(Language lang)
{
  typedef std::map<Language, const char*> map_type;
  typedef lang_map_type::value_type value_type;

  // I'll let you work out how 'my_stl_builder' works,
  // it makes for an interesting exercise and it's easy enough
  // Note that even if this is slightly slower (?), it is only executed ONCE!
  static const map_type = my_stl_builder<map_type>()
                          << value_type(GERMAN, "German")
                          << value_type(ENGLISH, "English")
                          << value_type(DUTCH, "Dutch")
                          ....
                          ;

  map_type::const_iterator it = lang_map.find(lang);
  if( it == lang_map.end() ) return NULL;
  return it->second;
}

And now on to the container type issue.

If you are concerned about performance, then you should be aware that for small data collection, a vector of pairs is normally more efficient in look ups than a map. Once again I would turn toward Loki (and its AssocVector), but really I don't think that you should worry about performance.

I tend to choose my container depending on the interface I am likely to need first and here the map interface is really what you want.

Also: why do you use 'const char*' rather than a 'std::string'?

I have seen too many people using a 'const char*' like a std::string (like in forgetting that you have to use strcmp) to be bothered by the alleged loss of memory / performance...

It depends on the purpose of the array. If you plan on showing the values in a list (for a user selection, perhaps) the array would be the most efficient way of storing them. If you plan on frequently looking up values by their enum key, you should look into a more efficient data structure like a map.

There is no need to write a loop. You can use the enum value as index for the array.

I would make an enum with sequential language codes

enum { GERMAN=0, ENGLISH, SWAHILI, ENOUGH };

The put them all into array

const char *langnames[] = {
    "German", "English", "Swahili"
};

Then I would check if sizeof(langnames)==sizeof(*langnames)*ENOUGH in debug build.

And pray that I have no duplicates or swapped languages ;-)

If you want fast and simple solution , Can try like this

enum ELanguage {GERMAN=0, ENGLISH=1};

    static const string Ger="GERMAN";
    static const string Eng="ENGLISH";

    bool getLanguage(const ELanguage& aIndex,string & arName)
    {
        switch(aIndex)
        {
        case GERMAN:
            {
                arName=Ger;
                return true;
            }
        case ENGLISH:
            {
                arName=Eng;
            }
        default:
            {
                // Log Error
                return false;
            }
        }
    }

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM