Should I have 2 identical tables?

Question

I need 2 different types of joins on the same tables (let's say ADDRESS and USER). I can either make 2 tables (BILLING_ADDRESS and SHIPPING_ADDRESS) that both have 3 columns (ID, USER_ID, ADDRESS_ID), or I can make a single table (CUSTOMER_ADDRESS) with a type column (ID, USER_ID, ADDRESS_ID, ADDRESS_TYPE).

For DRY coding practices I'm thinking just the single table, but that means when I compile the 2 lists I would have to do full table scans twice.

select address.* from customer_addresses, address where user_id = 1 and address_type = 'Billing'

and

select address.* from customer_addresses, address where user_id = 1 and address_type = 'Shipping'

Both rely on full table scans of the customer_addresses table.

If we have 1000 customer addresses that means 2000 records have been scanned to find all the addresses for that customer.

If I use the 2 different tables, then only 1000 customer addresses are scanned, because the shipping_addresses table only holds 800 address/customer records, and the billing_addresses table holds the other 200.

So for performance I would have to say the 2 different tables. For DRY I would have to go with the single table. What are the industry thoughts on this?

Answer 1

A shipping address and a billing address might be different things. For instance, a billing address might be a PO Box, but a shipping address often cannot be. Similarly, a shipping address might include other information, such as a contact name, contact phone, and drop-off instructions. I mention this only so you know there might be other fields: you need to decide whether the differences are material enough to create a separate entity, or whether a few extra fields in an address table are enough.

I think this is the query you suggest (with the join syntax fixed):

select a.*
from customer_addresses ca join
     address a 
     on ca.address_id = a.address_id
where ca.user_id = 1 and ca.address_type = 'Billing';

With a sensible data design, this does not require a full table scan. As Barmar points out in a comment, you should have proper indexes on these tables. In this case, the indexes you want are customer_addresses(user_id, address_type) and address(address_id). If a database could only do full table scans for SELECT queries, SQL would be a much less useful language and would probably not be used anywhere.
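One way to check this yourself is to ask the database for its query plan. The sketch below uses Python's built-in sqlite3 module with an in-memory database, which is only an assumption for illustration; the column layout and the index name idx_ca_user_type are made up, and other databases expose the plan through their own EXPLAIN syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (
        address_id INTEGER PRIMARY KEY,  -- the primary key serves as the address(address_id) index
        street     TEXT
    );
    CREATE TABLE customer_addresses (
        id           INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL,
        address_id   INTEGER NOT NULL REFERENCES address(address_id),
        address_type TEXT NOT NULL
    );
    -- Hypothetical index name; the columns are the ones suggested above.
    CREATE INDEX idx_ca_user_type ON customer_addresses (user_id, address_type);
""")

query = """
    SELECT a.*
    FROM customer_addresses ca
    JOIN address a ON ca.address_id = a.address_id
    WHERE ca.user_id = 1 AND ca.address_type = 'Billing'
"""

# The last column of each plan row is a readable description of one step.
plan_steps = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
for step in plan_steps:
    print(step)
```

If the index is picked up, the step for customer_addresses should mention idx_ca_user_type instead of a scan of the whole table. The exact wording of the plan differs between SQLite versions.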

Answer 2

A single table allows for more flexibility. For instance, in the future you might decide to allow a customer to store alternate shipping addresses and choose one when placing an order. You could then add address_type = 'Alternate Shipping Address'; you wouldn't have to add a whole new table.

There should be little performance impact of this design. An index on the user_id will narrow down the query to just a few rows that need to be scanned for the desired address type.
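As a rough illustration of both points, here is a minimal sketch using Python's built-in sqlite3 module (an assumption; any SQL database would do). The sample rows, the index name idx_ca_user, and the helper addresses_by_type are hypothetical. A new address type is added as an ordinary row, and one query returns every address for a user, grouped by type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, street TEXT);
    CREATE TABLE customer_addresses (
        id           INTEGER PRIMARY KEY,
        user_id      INTEGER NOT NULL,
        address_id   INTEGER NOT NULL REFERENCES address(address_id),
        address_type TEXT NOT NULL
    );
    -- Hypothetical index name; narrows lookups to one user's rows.
    CREATE INDEX idx_ca_user ON customer_addresses (user_id);

    INSERT INTO address VALUES (1, '1 Main St'), (2, 'PO Box 42'), (3, '9 Side Rd');
    INSERT INTO customer_addresses (user_id, address_id, address_type) VALUES
        (1, 1, 'Shipping'),
        (1, 2, 'Billing'),
        (2, 3, 'Shipping');
""")

# A new kind of address is just another row; no schema change is needed.
conn.execute(
    "INSERT INTO customer_addresses (user_id, address_id, address_type) VALUES (?, ?, ?)",
    (1, 3, "Alternate Shipping Address"),
)

def addresses_by_type(user_id):
    """Return {address_type: [street, ...]} for one user, using a single query."""
    rows = conn.execute(
        """
        SELECT ca.address_type, a.street
        FROM customer_addresses ca
        JOIN address a ON ca.address_id = a.address_id
        WHERE ca.user_id = ?
        ORDER BY ca.address_type, a.street
        """,
        (user_id,),
    )
    grouped = {}
    for address_type, street in rows:
        grouped.setdefault(address_type, []).append(street)
    return grouped

print(addresses_by_type(1))
```

Fetching all of a user's addresses in one pass and splitting them by type in the application also avoids running a separate query per address type.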

Answer 3

A single table is much better if it meets all your needs. Both scenarios you mention will contain redundant data; see database normalization for more information. I think a single table, ADDRESS (ID, USER_ID, SHIPPING_ADDRESS_ID, BILLING_ADDRESS_ID), is much better than having two tables for addresses; in that scenario you can't reach fourth normal form.
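A minimal sketch of this layout, again using Python's built-in sqlite3 module as an assumption. The linking table is called user_addresses here (a hypothetical name, chosen to avoid clashing with the address table that holds the address data), and the sample rows are made up. Note that this shape assumes each user has at most one shipping and one billing address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, street TEXT);
    -- Hypothetical table name: one row per user, one column per address role.
    CREATE TABLE user_addresses (
        id                  INTEGER PRIMARY KEY,
        user_id             INTEGER NOT NULL UNIQUE,
        shipping_address_id INTEGER REFERENCES address(address_id),
        billing_address_id  INTEGER REFERENCES address(address_id)
    );
    INSERT INTO address VALUES (1, '1 Main St'), (2, 'PO Box 42');
    INSERT INTO user_addresses (user_id, shipping_address_id, billing_address_id)
    VALUES (1, 1, 2);
""")

# Join address twice, once per role, to get both addresses in a single row.
row = conn.execute(
    """
    SELECT s.street AS shipping_street, b.street AS billing_street
    FROM user_addresses ua
    LEFT JOIN address s ON ua.shipping_address_id = s.address_id
    LEFT JOIN address b ON ua.billing_address_id = b.address_id
    WHERE ua.user_id = ?
    """,
    (1,),
).fetchone()
print(row)
```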

3 answers

solution1
2 ACCEPTED 2015-02-07 16:11:08

solution2
0 2015-02-07 16:08:12

solution3
0 2015-02-07 16:15:05
