
Storing multiple data types

I want to store metadata settings for each user, but a metadata value can be one of several data types: integer, string, date, or boolean.

So I came up with my own solution:

user(user_id(PK), ...)

meta(
   meta_id (PK)
 , user_id (FK)
 , data_type
 , meta_name
 , ...)

meta_user(
   user_id(FK)
 , meta_id(FK)
 , number_value
 , decimal_value
 , string_value
 , date_value
 , time_value
 , boolean_value)

But I'm not sure this is the right way to store multiple data types. I hope someone here can share their solution.

UPDATE:
A user may have many metadata entries, and each user must register their metadata first.

My contribution to this is:

  1. Since the metadata needs names, add a meta_definition table:

     meta_definition (metadefinition_ID(PK), name (varchar), datatype) 
  2. Then modify your meta_user table

     meta_user (meta_ID (PK), user_ID(FK), metadefinition_ID(FK)) 
  3. You have 3 choices (A,B,C) or even more...

Variant A: Keep your design of storing the values for all possible datatypes in one single row resulting in a sparse table.

This is the easiest to implement but the ugliest in terms of 'clean design' (my opinion).
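A minimal sketch of Variant A, written with Python's built-in sqlite3 module only so that it runs anywhere. Only three of the value columns from the question are shown, meta_name is inlined for brevity, and the CHECK constraint (exactly one value column filled per row) is my own assumption rather than part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE meta_user (
        meta_id       INTEGER PRIMARY KEY,
        user_id       INTEGER NOT NULL,
        meta_name     TEXT    NOT NULL,
        number_value  INTEGER,
        string_value  TEXT,
        boolean_value INTEGER,
        -- assumption: exactly one value column may be non-NULL per row
        CHECK ((number_value IS NOT NULL)
             + (string_value IS NOT NULL)
             + (boolean_value IS NOT NULL) = 1)
    )
""")

conn.execute(
    "INSERT INTO meta_user (user_id, meta_name, number_value) VALUES (1, 'age', 42)"
)
conn.execute(
    "INSERT INTO meta_user (user_id, meta_name, string_value) VALUES (1, 'nickname', 'bob')"
)

# A row that fills two value columns violates the CHECK constraint.
try:
    conn.execute(
        "INSERT INTO meta_user (user_id, meta_name, number_value, string_value) "
        "VALUES (1, 'bad', 1, 'x')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute(
    "SELECT meta_name, number_value, string_value, boolean_value "
    "FROM meta_user ORDER BY meta_id"
).fetchall()
```

Every row carries NULLs in all but one value column, which is the sparseness mentioned above.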

Variant B: Use a distinct value table per data type. Instead of one sparse meta_user table, you could use 6 tables (meta_number, meta_decimal, meta_string, and so on), one for each value column in your design. Each table has the form:

meta_XXXX (metadata_ID(PK), meta_ID(FK), value)

Imho, this is the cleanest design (but a bit complicated to work with).
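Variant B might look roughly like this. It is a sketch in Python's sqlite3, with only two of the per-type tables and with the attribute name inlined into meta_user to keep it short; the sample user and values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meta_user (
        meta_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        name    TEXT    NOT NULL   -- simplification: normally lives in meta_definition
    );
    CREATE TABLE meta_number (
        metadata_id INTEGER PRIMARY KEY,
        meta_id     INTEGER NOT NULL REFERENCES meta_user(meta_id),
        value       INTEGER NOT NULL
    );
    CREATE TABLE meta_string (
        metadata_id INTEGER PRIMARY KEY,
        meta_id     INTEGER NOT NULL REFERENCES meta_user(meta_id),
        value       TEXT NOT NULL
    );
    INSERT INTO meta_user VALUES (1, 7, 'age'), (2, 7, 'nickname');
    INSERT INTO meta_number (meta_id, value) VALUES (1, 42);
    INSERT INTO meta_string (meta_id, value) VALUES (2, 'bob');
""")

# One LEFT JOIN per value table; at most one of them matches each attribute.
rows = conn.execute("""
    SELECT mu.name, mnum.value, mstr.value
      FROM meta_user mu
           LEFT JOIN meta_number mnum ON mnum.meta_id = mu.meta_id
           LEFT JOIN meta_string mstr ON mstr.meta_id = mu.meta_id
     WHERE mu.user_id = 7
     ORDER BY mu.meta_id
""").fetchall()
```

Each value column keeps its native type and can be declared NOT NULL, at the price of one extra join per data type.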

Variant C: Reduce meta_user to hold three columns (I renamed it to meta_values, as it holds values and not users).

meta_values (metavalue_ID(PK), meta_ID(FK), value (varchar))

Format all values as string/varchar and put them into the value column. This is not a clean design, and it is a bad idea if you are going to use the values within SQL, because you would need expensive and complicated casts to get back the 'real' values.

This is imho the most compact design.
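With Variant C the client has to convert between native values and their text form, using the datatype registered for the attribute. A rough sketch of that round trip follows; the datatype names and the text formats (ISO dates, "true"/"false") are assumptions of mine, not something the design prescribes.

```python
from datetime import date

# Hypothetical converters, keyed by the datatype name stored in meta_definition.
TO_TEXT = {
    "integer": str,
    "string": str,
    "boolean": lambda b: "true" if b else "false",
    "date": lambda d: d.isoformat(),
}
FROM_TEXT = {
    "integer": int,
    "string": str,
    "boolean": lambda s: s == "true",
    "date": date.fromisoformat,
}


def encode(datatype, value):
    """Format a native value as the string stored in meta_values.value."""
    return TO_TEXT[datatype](value)


def decode(datatype, text):
    """Convert a stored string back to its native type."""
    return FROM_TEXT[datatype](text)


stored = encode("date", date(2024, 1, 31))
restored = decode("date", stored)
```

A new datatype only needs a new pair of converters; the table and the queries stay the same.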

To list all metadata of a specific user, you can use

select u.name,
       md.name as AttributeName,
       md.DataType
  from user u
       join meta_user mu on u.user_ID = mu.user_ID
       join meta_definition md on md.metadefinition_ID = mu.metadefinition_ID

Selecting the values for a given user would be:

Variant A:

select u.name,
       md.name as AttributeName,
       mv.*    -- show all different data types
  from user u
       join meta_user mu on u.user_ID = mu.user_ID
       join meta_definition md on md.metadefinition_ID = mu.metadefinition_ID
       join meta_values mv on mv.meta_ID = mu.meta_ID

Disadvantage: When new datatypes are available, you would have to add a column, recompile the query and change your software as well.

Variant B:

select u.name,
       md.name as AttributeName,
       mnum.value as NumericValue,
       mdec.value as DecimalValue
...
  from user u
       join meta_user mu on u.user_ID = mu.user_ID
       join meta_definition md on md.metadefinition_ID = mu.metadefinition_ID
       left join meta_number mnum on mnum.meta_ID = mu.meta_ID
       left join meta_decimal mdec on mdec.meta_ID = mu.meta_ID
...

Disadvantage: Slow if many users and attributes are being stored. Needs a new table when a new datatype is being introduced.

Variant C:

select u.name,
       md.name as AttributeName,
       md.DataType, -- client needs this to convert to original datatype
       mv.value     -- appears formatted as string
  from user u
       join meta_user mu on u.user_ID = mu.user_ID
       join meta_definition md on md.metadefinition_ID = mu.metadefinition_ID
       join meta_values mv on mv.meta_ID = mu.meta_ID

Advantage: Don't have to change the query in case new datatypes are being introduced.

Every data type in PostgreSQL can be cast to text, which is therefore the natural common ground for values of different types.

I suggest you have a look at the additional module hstore. I quote the manual:

This module implements the hstore data type for storing sets of key/value pairs within a single PostgreSQL value. This can be useful in various scenarios, such as rows with many attributes that are rarely examined, or semi-structured data. Keys and values are simply text strings.

That's a fast, proven, versatile solution and easy to extend for more attributes.

In addition you could have a table where you register meta-data (like data type and more) for each attribute. You could use this meta-data to check whether the attribute can be cast to its attached type (and passes additional tests stored in the meta-table) in a trigger on INSERT OR UPDATE to maintain integrity.
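The trigger idea can be sketched as follows. To keep it runnable without a PostgreSQL server, this uses SQLite through Python's sqlite3 module as a stand-in, so the trigger syntax differs from PostgreSQL's. It only validates the 'integer' datatype and only covers INSERT; the table layout and the trigger name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meta_definition (
        metadefinition_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        datatype TEXT NOT NULL
    );
    CREATE TABLE meta_values (
        metavalue_id      INTEGER PRIMARY KEY,
        metadefinition_id INTEGER NOT NULL
            REFERENCES meta_definition(metadefinition_id),
        value             TEXT NOT NULL
    );
    -- Reject a text value that does not survive a round trip through INTEGER
    -- when the registered datatype of the attribute is 'integer'.
    CREATE TRIGGER meta_values_check BEFORE INSERT ON meta_values
    BEGIN
        SELECT CASE
            WHEN (SELECT datatype FROM meta_definition
                   WHERE metadefinition_id = NEW.metadefinition_id) = 'integer'
                 AND (CAST(NEW.value AS INTEGER) || '') <> NEW.value
            THEN RAISE(ABORT, 'value is not a valid integer')
        END;
    END;
    INSERT INTO meta_definition VALUES (1, 'age', 'integer');
""")

conn.execute("INSERT INTO meta_values (metadefinition_id, value) VALUES (1, '42')")

try:
    conn.execute("INSERT INTO meta_values (metadefinition_id, value) VALUES (1, 'abc')")
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

stored = [row[0] for row in conn.execute("SELECT value FROM meta_values")]
```

In PostgreSQL the same check would live in a trigger function that attempts the cast and raises an exception when it fails.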

Be aware that many standard features of the RDBMS are not easily available with such a storage regime. Basically, you run into most of the problems you have with an EAV ("entity-attribute-value") model. You can find very good advice on that under this related question on dba.SE.
