
Data structures to implement an unknown table schema in C/C++?

Our task is to read information about a table schema from a file, implement that table in C/C++, and then successfully run some "select" queries on it. The table schema file may have contents like this:

    Tablename- Student
    "ID","int(11)","NO","PRIMARY","0","".

Now, my question is which data structures would be appropriate for the task. The problem is that I do not know how many columns a table might have, what those columns might be named, or what their data types are. For example, one table might have just one column of type int, while another might have 15 columns of varying data types. In fact, I don't even know how many tables the schema file describes.

One way I thought of was to have a fixed number of, say, 20 vectors (assuming that a table has at most 20 columns), name those vectors 1stvector, 2ndvector and so on, map the column names to the vectors, and then use them accordingly. But it seems the code for that would be a mess, with all the if/else or switch/case statements needed for the mapping.

While googling/stack-overflowing, I learned that you can't define a class at runtime; otherwise the problem might have been easier to solve.

Any help is appreciated. Thanks.

As a C++ data structure, you could try a std::vector< std::vector<boost::any> >. std::vector is part of the Standard Library and can be resized dynamically. A vector of vectors gives you an arbitrary number of rows, each with an arbitrary number of columns. Boost.Any is not part of the Standard Library, but it is widely available and allows storing values of arbitrary types.
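A minimal sketch of that layout follows. To keep it free of external dependencies it uses std::any from C++17 in place of boost::any (the two have very similar interfaces); the names Row, Table, make_student_table, id_of and name_of are made up for illustration.

```cpp
#include <any>
#include <string>
#include <vector>

// One row is a sequence of cells; each cell can hold a value of any type.
using Row = std::vector<std::any>;
using Table = std::vector<Row>;

// Hypothetical helper: builds a tiny two-column table (ID, Name).
Table make_student_table() {
    Table t;
    t.push_back(Row{1, std::string("Alice")});
    t.push_back(Row{2, std::string("Bob")});
    return t;
}

// Reading a cell back requires naming its type. std::any_cast throws
// std::bad_any_cast if the stored type is different from the one requested.
int id_of(const Row& row) { return std::any_cast<int>(row.at(0)); }

std::string name_of(const Row& row) {
    return std::any_cast<std::string>(row.at(1));
}
```

Note that the caller still has to know which type each column holds before it can read a cell, so in practice you would keep the column types from the schema file next to the table.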

I am not aware of any good C++ library for running SQL queries on such a data structure, so you might need to write your own. For example, the SQL select and where clauses would correspond to an STL algorithm such as std::find_if (which returns the first matching row) or std::copy_if (which collects all matching rows), with an appropriate predicate passed as a function object.
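As a rough illustration of that idea, here is a sketch of a "where" filter built on std::copy_if. Cells are stored as plain strings to keep the example short, and select_where is a hypothetical name, not an existing library function.

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <string>
#include <vector>

using Row = std::vector<std::string>;
using Table = std::vector<Row>;

// Rough equivalent of "SELECT * FROM table WHERE pred(row)".
// std::copy_if keeps every matching row; std::find_if would stop at the
// first match instead.
Table select_where(const Table& table,
                   const std::function<bool(const Row&)>& pred) {
    Table result;
    std::copy_if(table.begin(), table.end(), std::back_inserter(result), pred);
    return result;
}
```

A call such as select_where(students, [](const Row& r) { return r[1] == "Alice"; }) would then return every row whose second column equals "Alice".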

To deal with the lack of knowledge about the column data types, you almost have to store the raw input (i.e. strings, which suggests std::string) and coerce the interpretation as needed later on.

This also has the advantage that the column names can be stored in the same type.
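A sketch of the string-based approach, under the assumption that every cell is kept as text and only converted when a query needs a number. StringTable and its member names are invented for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

struct StringTable {
    std::vector<std::string> columns;            // column names
    std::vector<std::vector<std::string>> rows;  // raw cell text

    // Position of a named column; throws std::out_of_range if it is missing.
    std::size_t column_index(const std::string& name) const {
        for (std::size_t i = 0; i < columns.size(); ++i) {
            if (columns[i] == name) return i;
        }
        throw std::out_of_range("no such column: " + name);
    }

    // Raw text of one cell.
    const std::string& text(std::size_t row, const std::string& col) const {
        return rows.at(row).at(column_index(col));
    }

    // Coerce on demand. std::stoi throws std::invalid_argument if the cell
    // does not start with a number.
    int as_int(std::size_t row, const std::string& col) const {
        return std::stoi(text(row, col));
    }
};
```

The column names and the cell values share one type here, so the header can be read with the same code as the data.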

If you really want to determine the column type, you'll need to speculatively parse each column of input to see what it could be and make decisions on that basis.
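One possible way to do that speculative parsing is sketched below: try the narrowest type first and widen when a value does not fit. The ColumnType enum and the function names are assumptions made for this example, and the heuristic is deliberately loose (std::stod also accepts inputs such as "inf" or hexadecimal floats).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class ColumnType { Integer, Real, Text };

// True if the whole string is consumed when parsed as an integer.
static bool parses_as_int(const std::string& s) {
    try {
        std::size_t used = 0;
        std::stoll(s, &used);
        return used == s.size();
    } catch (const std::exception&) {
        return false;  // not a number, or out of range
    }
}

// True if the whole string is consumed when parsed as a floating-point value.
static bool parses_as_real(const std::string& s) {
    try {
        std::size_t used = 0;
        std::stod(s, &used);
        return used == s.size();
    } catch (const std::exception&) {
        return false;
    }
}

// Starts from the narrowest type and widens as soon as a value does not fit.
// An empty column is reported as Integer, which is an arbitrary choice.
ColumnType infer_type(const std::vector<std::string>& values) {
    ColumnType type = ColumnType::Integer;
    for (const std::string& v : values) {
        if (type == ColumnType::Integer && !parses_as_int(v)) {
            type = ColumnType::Real;
        }
        if (type == ColumnType::Real && !parses_as_real(v)) {
            return ColumnType::Text;
        }
    }
    return type;
}
```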

Either way, if the input could contain a field that has the column separator in it (say, a string including a space in otherwise whitespace-separated data), you will have to know the quoting convention of the input and write a parser of some kind to work on the data (reading whole lines in with getline is your friend here). Your input appears to be comma-separated, with strings delimited by double quotes.
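Here is a sketch of such a parser for the format shown in the question. The function names are hypothetical, and the handling of a doubled quote ("") inside a quoted field as a literal quote is an assumption, since the question does not say how quotes are escaped.

```cpp
#include <cstddef>
#include <istream>
#include <string>
#include <vector>

// Splits one line into fields. A field may be wrapped in double quotes, in
// which case commas inside it are kept as data.
// Assumption: "" inside a quoted field stands for one literal quote.
std::vector<std::string> split_quoted_csv(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c != '"') {
                current += c;
            } else if (i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';  // escaped quote
                ++i;
            } else {
                in_quotes = false;  // closing quote
            }
        } else if (c == '"') {
            in_quotes = true;  // opening quote
        } else if (c == ',') {
            fields.push_back(current);  // separator outside quotes
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);  // the last field has no trailing comma
    return fields;
}

// Reads the stream line by line with std::getline, skipping empty lines.
std::vector<std::vector<std::string>> read_rows(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) rows.push_back(split_quoted_csv(line));
    }
    return rows;
}
```

Characters outside quotes are kept as part of the field, so a stray trailing character after the last closing quote would end up in the last field.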

I suggest using a std::vector to hold all the table creation statements. After all the creation statements have been read in, you can construct your tables.
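One way to hold those definitions is sketched below. ColumnDef, TableDef, Schema and parse_table_header are hypothetical names, and the "Tablename-" prefix is taken from the single sample line in the question, so the real file format may differ.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct ColumnDef {
    std::string name;       // e.g. "ID"
    std::string type_text;  // e.g. "int(11)", kept exactly as written
};

struct TableDef {
    std::string name;
    std::vector<ColumnDef> columns;
};

// All table definitions read from the schema file, in file order.
using Schema = std::vector<TableDef>;

// Recognises a header line such as "Tablename- Student".
// Returns true and stores the table name if the line starts with the prefix.
bool parse_table_header(const std::string& line, std::string& name) {
    const std::string prefix = "Tablename-";
    if (line.compare(0, prefix.size(), prefix) != 0) return false;
    const std::size_t start = line.find_first_not_of(" \t", prefix.size());
    name = (start == std::string::npos) ? std::string() : line.substr(start);
    return true;
}
```

Each header line would start a new TableDef in the Schema, and each following column line would append one ColumnDef to the most recent table.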

The problem to overcome is the plethora of column types. All the C++ containers require a uniform element type, such as std::vector<std::string>, but your table will have columns of different types.

One solution is to have your data types derive from a single base class. That would allow you to have a std::vector<Base *> for each row of the table, where the pointers can point to fields of different (derived) types.
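A minimal sketch of that hierarchy is shown here. It uses std::unique_ptr instead of raw Base * pointers so that the fields are freed automatically; the class names Field, IntField and TextField and the helper make_row are invented for illustration.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common base class for every kind of cell.
struct Field {
    virtual ~Field() = default;
    virtual std::string to_string() const = 0;
};

struct IntField : Field {
    int value;
    explicit IntField(int v) : value(v) {}
    std::string to_string() const override { return std::to_string(value); }
};

struct TextField : Field {
    std::string value;
    explicit TextField(std::string v) : value(std::move(v)) {}
    std::string to_string() const override { return value; }
};

// A row owns its fields through base-class pointers, so one container can
// hold cells of different derived types.
using Row = std::vector<std::unique_ptr<Field>>;

// Hypothetical helper: builds a row with an integer ID and a text name.
Row make_row(int id, const std::string& name) {
    Row row;
    row.push_back(std::make_unique<IntField>(id));
    row.push_back(std::make_unique<TextField>(name));
    return row;
}
```

Operations that every column type must support (printing, comparison and so on) would become virtual functions on the base class, which avoids the if/else chains mentioned in the question.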

I'll leave the rest up to the OP to figure out.
