
Different Types for the Same Column

My database stores version numbers; however, they come in two formats: major.minor.build (e.g. 8.2.0, 12.0.1) and dates (e.g. YY-MM-DD). I have thought of two solutions:

+---+---+-----+-----------+ +-----+-----+-----+-----+ +-----+--------+
|...|...|id   |versionType| |id   |major|minor|build| |id   |date    |
|---+---+-----+-----------| |-----+-----+-----+-----| |-----+--------|
|...|...|12345|0          | |12345|0    |1    |2    | |21432|12-04-05|
|---+---+-----+-----------| +-----+-----+-----+-----+ +-----+--------+
|...|...|21432|1          |
+---+---+-----+-----------+

or

+---+---+-----+-----+-----+-----+--------+
|...|...|id   |major|minor|build|date    |
|---+---+-----+-----+-----+-----+--------|
|...|...|12345|0    |1    |2    |null    |
|---+---+-----+-----+-----+-----+--------+
|...|...|21432|null |null |null |12-04-05|
+---+---+-----+-----+-----+-----+--------+

Neither of these looks particularly efficient: the first requires a join across two tables just to get a version number, while the second uses about twice as much space per version entry as the first. Alternatively, I could just store the value as some number of bits in a single column and interpret it on the client side, but I'm hoping that there's some standard practice for this situation that I've overlooked.

Is there a proper way to store two different types of data for the same 'column' in a relational database?

Is your situation one where you have distinct kinds of versioned object, one kind versioned by date and another by version number, or one where the same kind of object's version is referenced both by date and by version number?

In the first case, don't bother with creating such an artificial table that doesn't serve any useful purpose. You need to create tables only if they solve a business problem that really exists, and the translation from version date to version number or vice-versa is one that doesn't exist in this situation. And even if it arises later on, then you can still ...

In the second case, define a table like the one in your second option, but:

WITHOUT all those stupid meaningless IDs. Just leave the four columns maj/min/bld/date. And DON'T MAKE ANY OF THEM NULLABLE. Define two keys: maj/min/bld and date. Register a row for each new build, recording the creation (/activation/whatever ...) date of the build. Use the maj/min/bld construct as the version indicator in whatever table describes the versioned object you are managing, and whenever a request comes in where the version is referenced by date, resolve it to a version number through a query on your 4-column table.
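As a rough illustration of that 4-column table, here is a minimal sketch using Python's built-in sqlite3 module. The table name `build_version`, the column names, the helper `version_for_date`, and the sample rows (including which date goes with which build) are all made up for the example; the dates are written as ISO 8601 text because SQLite has no dedicated date type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE build_version (
        major      INTEGER NOT NULL,
        minor      INTEGER NOT NULL,
        build      INTEGER NOT NULL,
        build_date TEXT    NOT NULL,        -- ISO 8601 text, e.g. '2012-04-05'
        PRIMARY KEY (major, minor, build),  -- first key: the version number
        UNIQUE (build_date)                 -- second key: the build date
    )
""")

# Illustrative rows only: one row is registered per build.
conn.executemany(
    "INSERT INTO build_version VALUES (?, ?, ?, ?)",
    [(8, 2, 0, "2011-11-30"), (12, 0, 1, "2012-04-05")],
)


def version_for_date(conn, build_date):
    """Resolve a date-style version reference to (major, minor, build), or None."""
    return conn.execute(
        "SELECT major, minor, build FROM build_version WHERE build_date = ?",
        (build_date,),
    ).fetchone()


print(version_for_date(conn, "2012-04-05"))  # (12, 0, 1)
```

With both keys declared, the database itself rejects a second build registered under an existing date or an existing version number.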

The first one is better. If doing the join every time feels like a headache, you can create a view (with the join) and query that view rather than using the tables directly (and repeating the join each time).
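One way the view idea could look, sketched with Python's built-in sqlite3 module. The table and view names (`version`, `version_number`, `version_date`, `version_label`) are assumptions for the example, and the two rows mirror the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE version (
        id           INTEGER PRIMARY KEY,
        version_type INTEGER NOT NULL      -- 0 = number, 1 = date
    );
    CREATE TABLE version_number (
        id    INTEGER PRIMARY KEY REFERENCES version(id),
        major INTEGER NOT NULL,
        minor INTEGER NOT NULL,
        build INTEGER NOT NULL
    );
    CREATE TABLE version_date (
        id           INTEGER PRIMARY KEY REFERENCES version(id),
        version_date TEXT NOT NULL
    );

    -- The view hides the joins: callers see one label per version id.
    CREATE VIEW version_label AS
    SELECT v.id,
           COALESCE(d.version_date,
                    n.major || '.' || n.minor || '.' || n.build) AS label
    FROM version v
    LEFT JOIN version_number n ON n.id = v.id
    LEFT JOIN version_date   d ON d.id = v.id;

    INSERT INTO version VALUES (12345, 0), (21432, 1);
    INSERT INTO version_number VALUES (12345, 0, 1, 2);
    INSERT INTO version_date VALUES (21432, '12-04-05');
""")

rows = conn.execute("SELECT id, label FROM version_label ORDER BY id").fetchall()
print(rows)  # [(12345, '0.1.2'), (21432, '12-04-05')]
```

Queries then select from `version_label` and never spell out the join themselves.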

I don't think there's a silver bullet here. If you are adamant about having a single column, you could just naively slap them both into a CHAR(10), though this has its own issues (e.g. invalid dates, malformed build number strings, etc.).
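To make the validation issue concrete: with a single text column, something has to decide which format each string is in and reject the rest. A minimal sketch of such a check, assuming the two formats from the question (major.minor.build and YY-MM-DD); the function name `classify` is hypothetical.

```python
import re
from datetime import datetime

# Three dot-separated runs of digits, e.g. "8.2.0".
VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def classify(value):
    """Return 'version' or 'date'; raise ValueError for anything else."""
    if VERSION_RE.fullmatch(value):
        return "version"
    try:
        # strptime rejects impossible dates such as month 13.
        datetime.strptime(value, "%y-%m-%d")
    except ValueError:
        raise ValueError(f"not a version or a date: {value!r}") from None
    return "date"


print(classify("8.2.0"))     # version
print(classify("12-04-05"))  # date
```

A check like this would have to run on every write, in the application or in a database constraint, because the column type alone no longer guarantees anything.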

I think the key question is really what sort of queries do you envision running, and how many rows do you expect to have?

I'd let the anticipated query need drive the DB design.

Possibly you can store your data in one int/bigint field. But in this case you will have to convert all values:

Date: convert it to the number of days since some reference date, or possibly use a Unix timestamp.

Version: put limits on the values of build (say, below 1000) and minor (below 1000), then store major*1000*1000 + minor*1000 + build.
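The integer packing described above can be sketched as follows. The limit of 1000 follows the suggestion in this answer (so minor and build must each stay in 0..999 for the encoding to be reversible); the reference date and the function names are assumptions for the example.

```python
from datetime import date

LIMIT = 1000              # minor and build must each be in range(LIMIT)
EPOCH = date(2000, 1, 1)  # assumed reference date for the day count


def encode_version(major, minor, build):
    """Pack major.minor.build into one integer that sorts in version order."""
    if major < 0 or not (0 <= minor < LIMIT and 0 <= build < LIMIT):
        raise ValueError("version component out of range")
    return major * LIMIT * LIMIT + minor * LIMIT + build


def decode_version(n):
    """Inverse of encode_version."""
    major, rest = divmod(n, LIMIT * LIMIT)
    minor, build = divmod(rest, LIMIT)
    return major, minor, build


def encode_date(d):
    """Number of days between EPOCH and d."""
    return (d - EPOCH).days


print(encode_version(12, 0, 1))    # 12000001
print(decode_version(8_002_000))   # (8, 2, 0)
```

Note that the two encodings share the same integer range, so the stored number alone does not say whether it is a date or a version; something like the versionType flag from the question is still needed.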

I would go with the second option because the grain of the table is a version, no matter how it comes in. Storing it in 3 number columns and 1 date column should only be around 16 bytes per row (roughly 4 bytes for each number and date column; the exact sizes depend on the database and the column types) on a database like Oracle. If your database handles virtual (calculated) columns, you could add one that looks something like nvl(to_char(date),major||'.'||minor||'.'||build) and always select from it with the data formatted as a varchar, or you can put a view on it.
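A small sketch of the "put a view on it" variant, using Python's built-in sqlite3 module rather than Oracle, so COALESCE stands in for nvl and the date is already stored as text. The table name `version`, the view name `version_display`, and the column name `version_date` are assumptions; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE version (
        id           INTEGER PRIMARY KEY,
        major        INTEGER,
        minor        INTEGER,
        build        INTEGER,
        version_date TEXT
    );

    -- One formatted label per row, whichever form the version takes.
    CREATE VIEW version_display AS
    SELECT id,
           COALESCE(version_date,
                    major || '.' || minor || '.' || build) AS label
    FROM version;

    INSERT INTO version VALUES
        (12345, 0, 1, 2, NULL),
        (21432, NULL, NULL, NULL, '12-04-05');
""")

labels = dict(conn.execute("SELECT id, label FROM version_display"))
print(labels)  # {12345: '0.1.2', 21432: '12-04-05'}
```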

It really depends on how you want to use this version number in queries. If it's enough to treat version numbers as labels, then you should consider storing it in a varchar column.

If that is not enough, and you want to be able to sort on version number, things get more complex, because it would then be more convenient to store the version number in a data type that preserves natural sorting order. I'd probably go with your solution 2. Is there a reason to worry about efficiency? Are you expecting millions of rows? If not, then you probably shouldn't worry too much.

If space, speed, and volume are definite considerations, you could consider storing the actual version number as a text label, and deriving a surrogate version number (a single integer) that you store in a separate column which you use for sorting purposes.

If, on the other hand, it turns out some objects have a version date, some a version number, and sometimes both, I would still consider your second solution. If you really want an uber-normalized model, you could consider creating separate version_date and version_number tables, which have a 1:1 relationship with the object table. And you would still need those joins. Again though, there is no reason to worry about that: databases are good at joins; it's what they were made to do well.
