
Database design: reducing joins

This is a database for an online ticket store (similar to Airbnb Experiences).

A product (ticket) has available days (and times).

On an available day,
- there could be multiple options (such as beginner-class, advanced-class)
- there is a quantity that can be sold (shared among multiple options)

One way to represent this is:

Product
  name

Variant (Option)
  product

TimeSlot
  product
  date
  time
  quantity


TimeslotVariant
  variant
  timeslot

Another way would be the following.

I see two main differences,

  • First difference

    • Above: you need to join on TimeSlot to find which variants are available on a given day.
    • Below: you can directly query TimeVariant
  • Second difference

    • Above: [{date, time, [variant1, variant2], quantity}] (I think the client application would prefer this)
    • Below: [{date, time, variant1}, {date, time, variant2}] + [{date, time, quantity}]

Product
  name

Variant (Option)
  product


TimeSlot
  product
  date
  time
  quantity

TimeVariant
  variant
  date
  time

I think the first option is more intuitive (?), but I also think the additional join can sometimes be painful to maintain.

What questions (criteria) should I ask myself to decide between the two?

IMHO, the number of joins is probably not the most important question to ask yourself when designing a relational database.

The most important question is how to protect data integrity. Data integrity is best preserved by the first option you presented, so that is the option you should go with.

If the joins are what's bothering you, you can always use views to "flatten" the data.
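As a rough illustration of the view idea, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (product_id, timeslot_variant_flat, and so on) and the sample rows are assumptions made for the example; the question only gives the schema in outline.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- First design from the question; column names are illustrative assumptions.
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE variant (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    name TEXT NOT NULL
);
CREATE TABLE timeslot (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    date TEXT NOT NULL,
    time TEXT NOT NULL,
    quantity INTEGER NOT NULL
);
CREATE TABLE timeslot_variant (
    variant_id INTEGER NOT NULL REFERENCES variant(id),
    timeslot_id INTEGER NOT NULL REFERENCES timeslot(id),
    PRIMARY KEY (variant_id, timeslot_id)
);

-- The view hides the joins, so callers query it like a flat table.
CREATE VIEW timeslot_variant_flat AS
SELECT ts.product_id, ts.date, ts.time, ts.quantity,
       v.id AS variant_id, v.name AS variant_name
FROM timeslot_variant tv
JOIN timeslot ts ON ts.id = tv.timeslot_id
JOIN variant v ON v.id = tv.variant_id;

-- Made-up sample data.
INSERT INTO product VALUES (1, 'Surf lesson');
INSERT INTO variant VALUES (1, 1, 'beginner-class'), (2, 1, 'advanced-class');
INSERT INTO timeslot VALUES (1, 1, '2024-07-01', '10:00', 8);
INSERT INTO timeslot_variant VALUES (1, 1), (2, 1);
""")

# No join in the application query: the view already did it.
rows = conn.execute(
    "SELECT date, time, variant_name, quantity FROM timeslot_variant_flat "
    "WHERE date = ? ORDER BY variant_name",
    ("2024-07-01",),
).fetchall()
print(rows)
```

The base tables stay normalized, and only the read path sees the flattened shape.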

But why is the first option better? Because the primary key (or at least the natural key) of the TimeSlot table must consist of product, date and time, and the second option doesn't take the product into consideration in the TimeVariant table.
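One way to see the integrity concern is the following sketch in Python's sqlite3. It is a simplified model, not the asker's actual schema: the product and variant tables are omitted, and the names timeslot_variant and time_variant are assumed spellings of the two link tables. In the second design nothing ties a row to an existing time slot, while in the first design a foreign key does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE timeslot (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,
    date TEXT NOT NULL,
    time TEXT NOT NULL,
    quantity INTEGER NOT NULL
);
-- First design: the link table points at one specific time slot.
CREATE TABLE timeslot_variant (
    variant_id INTEGER NOT NULL,
    timeslot_id INTEGER NOT NULL REFERENCES timeslot(id)
);
-- Second design: date and time are repeated, with no reference to timeslot.
CREATE TABLE time_variant (
    variant_id INTEGER NOT NULL,
    date TEXT NOT NULL,
    time TEXT NOT NULL
);
INSERT INTO timeslot VALUES (1, 1, '2024-07-01', '10:00', 8);
""")

# Second design: a variant on a day that has no time slot is silently accepted.
conn.execute("INSERT INTO time_variant VALUES (1, '2024-12-25', '10:00')")
orphans = conn.execute("""
    SELECT COUNT(*) FROM time_variant tv
    LEFT JOIN timeslot ts ON ts.date = tv.date AND ts.time = tv.time
    WHERE ts.id IS NULL
""").fetchone()[0]

# First design: pointing at a non-existent time slot is rejected.
try:
    conn.execute("INSERT INTO timeslot_variant VALUES (1, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(orphans, rejected)
```

Note that the orphan check has to match on date and time alone, because time_variant carries no product column of its own.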

You could add the product to that table as well, and some DBAs (those who oppose surrogate keys) would suggest that as the best option. Personally, even though I'm not a DBA, I think a surrogate key has its advantages, and one of them is exactly what you have here: you can join two tables on a single column instead of three, which makes your life much easier. As long as you also enforce uniqueness of the table's natural key(s), surrogate keys cause no integrity problem.
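A small sketch of that combination, again in Python's sqlite3 with assumed column names: the table gets a single-column surrogate key for joins, and a UNIQUE constraint keeps the natural key (product, date, time) enforced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE timeslot (
    id INTEGER PRIMARY KEY,            -- surrogate key, used for joins
    product_id INTEGER NOT NULL,
    date TEXT NOT NULL,
    time TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    UNIQUE (product_id, date, time)    -- natural key is still enforced
);
INSERT INTO timeslot (product_id, date, time, quantity)
VALUES (1, '2024-07-01', '10:00', 8);
""")

insert = (
    "INSERT INTO timeslot (product_id, date, time, quantity) "
    "VALUES (?, ?, ?, ?)"
)

# Same product, date and time again: rejected by the UNIQUE constraint.
try:
    conn.execute(insert, (1, "2024-07-01", "10:00", 5))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A different product at the same date and time is a distinct slot.
conn.execute(insert, (2, "2024-07-01", "10:00", 5))
count = conn.execute("SELECT COUNT(*) FROM timeslot").fetchone()[0]

print(duplicate_rejected, count)
```

Other tables can then reference timeslot(id) alone rather than carrying all three natural-key columns.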

First, ask yourself whether you want to reduce join overhead. If so, you can expose flattened data through views.

First Case

In TimeslotVariant you have added variant and timeslot, so you can reach the data you need starting from the TimeslotVariant table alone. In this case your data integrity is okay.

Second Case

If you add the product key to the variant table (TimeVariant in the second design), that table alone is enough to show the product listing, but date and time are then stored a second time.

My suggestion is to keep the first case and expose the flattened data through a view.

