
SQL Normalization with multiple “measures” tables

I'm currently trying to redesign a Point of Sale database to make it more normalized, which will help tremendously with managing the data. I'm a little unsure about the best design practices, given the data I have to deal with. First of all, there are basically two sets of measures, which share common keys: inventory data (units and dollars) and point of sale data (units and dollars). Each of these is at the customer, store, item and date level.

What I've done (mostly in theory at this point) is to create separate tables for:

Item level information 
  Item_ID, 
  Customer_ID,
  itemnumber 
  (and a few other item-specific fields).  

Stores 
  Store_ID, 
  Customer_ID, 
  Store Number,
  (and essentially address information)

Customer 
  Customer_ID, 
  Customer Number 
  (other customer specific information like name).  

So in addition to those "support" tables, I have the

Main Inventory Data 
  Store_ID 
  Item_ID 

I also have a POS Data table, with the exact same IDs.
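For reference, the layout described above could be sketched roughly as follows, using Python's built-in sqlite3 module so it can be run as-is. The table names, the `Data_Date` column, the measure column types, the `InventoryDollars` name and the uniqueness constraints are assumptions added for illustration; the post itself only lists the ID columns for the two data tables.

```python
import sqlite3

# In-memory database, used only to try out the table layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    Customer_ID     INTEGER PRIMARY KEY,
    Customer_Number TEXT NOT NULL UNIQUE
);
CREATE TABLE Stores (
    Store_ID     INTEGER PRIMARY KEY,
    Customer_ID  INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    Store_Number INTEGER NOT NULL,
    UNIQUE (Customer_ID, Store_Number)   -- assumed natural key
);
CREATE TABLE Items (
    Item_ID     INTEGER PRIMARY KEY,
    Customer_ID INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    ItemNumber  TEXT NOT NULL,
    UNIQUE (Customer_ID, ItemNumber)     -- assumed natural key
);
-- Data_Date and the measure columns are assumed; the post only lists the IDs.
CREATE TABLE Inventory_Data (
    Store_ID         INTEGER NOT NULL REFERENCES Stores(Store_ID),
    Item_ID          INTEGER NOT NULL REFERENCES Items(Item_ID),
    Data_Date        TEXT NOT NULL,
    InventoryUnits   INTEGER NOT NULL,
    InventoryDollars REAL,
    PRIMARY KEY (Store_ID, Item_ID, Data_Date)
);
CREATE TABLE Pos_Data (
    Store_ID   INTEGER NOT NULL REFERENCES Stores(Store_ID),
    Item_ID    INTEGER NOT NULL REFERENCES Items(Item_ID),
    Data_Date  TEXT NOT NULL,
    PosUnits   INTEGER NOT NULL,
    PosDollars REAL NOT NULL,
    PRIMARY KEY (Store_ID, Item_ID, Data_Date)
);
""")

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

In this sketch the two data tables carry only Store_ID and Item_ID, so the customer is reachable through either Stores or Items.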

Basically my questions are:

  • Should I include the Customer ID in the Pos Data and Inventory Data tables, even though it is already part of both the Stores and Items tables?
  • If I do add the Customer ID, and I join all of these tables together:

    1. would I join the Customer ID from all of the tables (Pos Data, Stores and Items, OR Inventory Data, Stores and Items) to the Customer table, or
    2. would joining from the Pos Data table alone be sufficient?

Let me give a few additional details, regarding the data. As an example, we have two Customers, CustomerA and CustomerB. CustomerA has several stores whose store numbers are 1000,1025, 1036 and 1037. CustomerB also has several stores, whose store numbers are 1025, 1030, and 1037. Store numbers 1025 and 1037 happen to be the same between customers, but the stores themselves are unique and completely different.

CustomerA's Store Number 1000 sells three of our items (this is a wholesale perspective), which are Items ABC, DEF and EFG. CustomerA's Store Number 1025 also sells three of our items, which are ABC, HIJ and XYZ.

Each of these items carries two important pieces of data with regard to its specific customer and store number: Point of Sale data and Inventory data. Point of Sale data would be in the form of PosUnits, which is the quantity of an item that was sold, and PosDollars, which is the total dollars of the item sold in that store (essentially the number of units times the price it was sold for). The Inventory data would be in InventoryUnits, which is the quantity of an item that is in stock at a store. [One thing to note: I separated inventory and POS data into separate tables because we don't always receive both pieces of data from every customer. Also, inventory and POS data are generally analyzed separately.]

So, back to my example: for CustomerA's Store Number 1000, item ABC may have sold 100 units, which is $1245.00. CustomerA's Store Number 1025 may have sold only 10 units of the same item, for $124.50.

Now if we go back to CustomerB, it just so happens this Customer also has an item named ABC that it sells in many of their stores. CustomerA's item ABC is a completely different product from CustomerB's item ABC. It's purely coincidental that they named them the same thing.
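To illustrate the point about overlapping numbers, here is a minimal sketch (Python with sqlite3) in which Store_ID and Item_ID are surrogate keys and the customer's own number is unique only within that customer. The `Name` column and the `UNIQUE (Customer_ID, ...)` constraints are assumptions, not something stated in the post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    Customer_ID INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL UNIQUE
);
CREATE TABLE Stores (
    Store_ID     INTEGER PRIMARY KEY,   -- surrogate key, generated by the database
    Customer_ID  INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    Store_Number INTEGER NOT NULL,
    UNIQUE (Customer_ID, Store_Number)  -- a number is unique only within a customer
);
CREATE TABLE Items (
    Item_ID     INTEGER PRIMARY KEY,    -- surrogate key
    Customer_ID INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    ItemNumber  TEXT NOT NULL,
    UNIQUE (Customer_ID, ItemNumber)
);
INSERT INTO Customer (Customer_ID, Name) VALUES (1, 'CustomerA'), (2, 'CustomerB');
INSERT INTO Stores (Customer_ID, Store_Number) VALUES (1, 1000), (1, 1025), (2, 1025);
INSERT INTO Items (Customer_ID, ItemNumber) VALUES (1, 'ABC'), (2, 'ABC');
""")

# Store number 1025 exists for both customers but maps to two different Store_IDs.
store_ids = conn.execute(
    "SELECT Store_ID FROM Stores WHERE Store_Number = 1025 ORDER BY Store_ID"
).fetchall()

# Likewise, item 'ABC' maps to one Item_ID per customer.
item_ids = conn.execute(
    "SELECT Item_ID FROM Items WHERE ItemNumber = 'ABC' ORDER BY Item_ID"
).fetchall()

# Repeating a store number within the same customer is rejected.
try:
    conn.execute("INSERT INTO Stores (Customer_ID, Store_Number) VALUES (1, 1025)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(store_ids, item_ids, duplicate_rejected)
```

Because the data tables reference Store_ID and Item_ID rather than the raw store and item numbers, the coincidental overlap between CustomerA and CustomerB does not cause rows to collide.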

Let me add one last point of clarification, which I probably should have stated earlier. My perspective is that of a wholesaler. When I say item, I'm speaking of the customer's item number, not the wholesaler's item number. There is a cross reference involved in getting to the wholesaler's item, and the customer may have more than one of their item numbers that reference the same wholesaler item number. I don't think it's necessary to delve into that, though.

Question #1: Under the normalization rules, you should avoid storing redundant data in any table unless performance issues require denormalization. There are thousands of articles that explain why redundancy should be avoided.

As for Question #2: as a rule, select only the columns that you need in your queries. If you need the Customer_ID, take it from whichever table is cheapest for the database to read.
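As a rough illustration of both points, the sketch below (Python with sqlite3) keeps Customer_ID out of the POS data table and reaches the customer through a single join path, Pos Data to Stores to Customer. The table names, the `Name` and `Data_Date` columns, the literal ID values and the date are assumptions; the unit and dollar figures are the ones from the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Customer_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Stores (
    Store_ID     INTEGER PRIMARY KEY,
    Customer_ID  INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    Store_Number INTEGER NOT NULL
);
CREATE TABLE Items (
    Item_ID     INTEGER PRIMARY KEY,
    Customer_ID INTEGER NOT NULL REFERENCES Customer(Customer_ID),
    ItemNumber  TEXT NOT NULL
);
-- No Customer_ID here: it is derivable from Store_ID.
CREATE TABLE Pos_Data (
    Store_ID   INTEGER NOT NULL REFERENCES Stores(Store_ID),
    Item_ID    INTEGER NOT NULL REFERENCES Items(Item_ID),
    Data_Date  TEXT NOT NULL,            -- assumed column and date value
    PosUnits   INTEGER NOT NULL,
    PosDollars REAL NOT NULL
);
INSERT INTO Customer VALUES (1, 'CustomerA');
INSERT INTO Stores VALUES (10, 1, 1000), (11, 1, 1025);
INSERT INTO Items VALUES (100, 1, 'ABC');
INSERT INTO Pos_Data VALUES
    (10, 100, '2024-01-01', 100, 1245.00),
    (11, 100, '2024-01-01', 10, 124.50);
""")

# One join path to the customer is enough; Items is not needed for this query.
rows = conn.execute("""
    SELECT c.Name, SUM(p.PosUnits) AS units, SUM(p.PosDollars) AS dollars
    FROM Pos_Data AS p
    JOIN Stores   AS s ON s.Store_ID = p.Store_ID
    JOIN Customer AS c ON c.Customer_ID = s.Customer_ID
    GROUP BY c.Name
""").fetchall()
print(rows)
```

Whether the path through Stores or through Items is cheaper depends on the actual table sizes and indexes, so it is worth checking the query plan on real data.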

Allow me to raise one more question:

Why do you repeat Customer_ID in both Stores and Item level information when you can join them through the Main Inventory Data? This is another redundancy.

