Is This MySQL Database Normalization Form and Structure Design Correct?

I am trying to figure out what kind of normalization and structure to use for a database I am making. It is going to be a list of properties (building number, street address, street name, city, state, zip code, unit number).

From there, I was going to make a table with various info. Then I was going to have an intermediate table to join all the information and make the record. As far as I can tell, just about every column will be multi-valued except unit number. So, I see the need for complete normalization:

Table building_number
---------------------
building_number_id int primary key auto index not null
building_number tinyint

Table city
--------------------
city_id int primary key auto index not null
city_name varchar(30)

Table state
--------------------
state_id int primary key auto index not null
state_name varchar(30)

Table zip
---------------------
zip_id int primary key auto index not null
zip_name varchar(30)

Table building_name
---------------------
building_name_id int primary key auto index not null
building_name varchar(50)

Table owner
---------------------
owner_id int primary key auto index not null
owner_name varchar(30)


Table info
----------------------
info_id int primary key auto index not null
rent tinyint
condition varchar(10)
comment varchar(1000)

Intermediate table
--------------------------
building_number_id int 
street_id int 
city_id int 
state_id int 
building_name_id 
owner_id 
info_id 
(all these keys are foreign keys referencing their respective tables/primary keys)

I will be creating an HTML search text box which will take dynamic input and pull up queries based on whatever is given: a complete exact address, a street name, or a building number with street name and city, etc. I haven't developed my MySQL search algorithm yet. I'm just at the beginning stage of creating my database.

I will be using the InnoDB engine and B-tree indexing. I will index every column except comment, since I will be doing these dynamic input searches (like Google).

I am doing this for myself as a hobbyist. Because of this, I prefer to do this by hand from scratch rather than using some framework or plugins.

For what I am doing, is this database design and normalization correct?

When you're creating tables, you should think first in terms of entities. In general terms, an entity is a tangible thing.

Examples of tangible things are: buildings, owners, contacts, cities, countries, time zones.

On the other hand, there are things that are not entities, but instead descriptors of entities.

Examples of descriptors are: height, weight, door number, and price.

Descriptors are generally attributes of entities. If it is not possible to enumerate all possible values of a descriptor in advance, it should probably not get a table of its own.

Cases where you would want a look-up table for a descriptor are generally where you're constrained in the types of values you can accept. For example, "shoe size" might seem open-ended, but maybe you only manufacture certain sizes, so a free-form input field is not practical. On the other hand, "height" is better stored as a value with a pre-defined set of units instead of having a look-up table of all possible heights.
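The distinction can be sketched as follows. This is a minimal illustration rather than a recommended schema: it uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere, and the table names (shoe_sizes, shoes), the column names, and the try_insert helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to (InnoDB enforces them by default).
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Constrained descriptor: only the listed sizes are acceptable, so use a look-up table.
CREATE TABLE shoe_sizes (
    shoe_size_id INTEGER PRIMARY KEY,
    label        TEXT NOT NULL UNIQUE
);

-- Open-ended descriptor: height is stored as a plain value in a fixed unit.
CREATE TABLE shoes (
    shoe_id      INTEGER PRIMARY KEY,
    shoe_size_id INTEGER NOT NULL REFERENCES shoe_sizes(shoe_size_id),
    height_cm    REAL NOT NULL
);
""")

# Hypothetical sample sizes; they receive ids 1, 2 and 3.
conn.executemany("INSERT INTO shoe_sizes (label) VALUES (?)", [("8",), ("8.5",), ("9",)])


def try_insert(shoe_size_id, height_cm):
    """Return True if the row is accepted, False if the size is not in the look-up table."""
    try:
        conn.execute(
            "INSERT INTO shoes (shoe_size_id, height_cm) VALUES (?, ?)",
            (shoe_size_id, height_cm),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(2, 11.5))   # size id 2 exists, any height is accepted
print(try_insert(99, 11.5))  # size id 99 is not a manufactured size
```

The look-up table earns its keep only because the set of sizes is closed; the height column needs no table because any measurement is valid.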

In your case, you need an "address" entity with a number of fields that describe it. Things like "building number" should be a free-form input field. "Building A", "82 1/2", "107B", "3.7", "4/9" and "44-290" are all valid building numbers. You should just accept a string.

Likewise, street names are hardly a thing you can qualify. Is "Green Way Street" the same as "Green Way St." or "Greenway St."? Does it matter? Probably not, as it's just a descriptor. You have no way of verifying these, and linking them together is almost impossible: there's way too much massaging required to get it to work on a large scale.
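To get a feel for the massaging involved, here is a deliberately naive sketch in Python. The normalize_street function and its tiny abbreviation map are made up for illustration; they are not a real address-matching technique.

```python
import re

# Hypothetical, deliberately tiny abbreviation map. Real data is messier:
# "St" can also mean "Saint", for example.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}


def normalize_street(name):
    """Lowercase, strip punctuation, and expand a few known abbreviations."""
    words = re.sub(r"[^\w\s]", "", name.lower()).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


for raw in ("Green Way Street", "Green Way St.", "Greenway St."):
    print(repr(raw), "->", repr(normalize_street(raw)))
```

The first two spellings collapse to the same string, but "Greenway St." does not, and no simple rule can decide whether "Green Way" and "Greenway" are the same street. Every such rule added to a street look-up table invites another exception.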

Also keep in mind that some places need two, three, four, or even five lines of address information to identify a location. The United Kingdom is one of the worst offenders here, where a formal address will include all sorts of information.

What you should probably do is design a table like "addresses" with fields: address1, address2, address3, address4, address5, city, region, country, postal_code. With that you can cover almost anything they'll throw at you. Look at the kind of data Google Maps returns for examples.
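A minimal sketch of such a table, assuming free-form text for every address part. It runs against Python's built-in sqlite3 module as a stand-in; in MySQL you would likely use VARCHAR columns with lengths of your choosing and an AUTO_INCREMENT key. The address_id column, the NOT NULL choices, and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE addresses (
    address_id  INTEGER PRIMARY KEY,
    address1    TEXT NOT NULL,   -- free-form, e.g. building number plus street
    address2    TEXT,
    address3    TEXT,
    address4    TEXT,
    address5    TEXT,
    city        TEXT NOT NULL,
    region      TEXT,
    country     TEXT NOT NULL,
    postal_code TEXT             -- text, so leading zeros and letters survive
)
""")

# Made-up sample rows; note the building numbers are not plain integers.
rows = [
    ("82 1/2 Green Way St.", "Unit 4", "Springfield", "IL", "US", "02134"),
    ("Building A", "44-290 Harbour Road", "Leeds", None, "GB", "LS1 4AP"),
]
conn.executemany(
    "INSERT INTO addresses (address1, address2, city, region, country, postal_code) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# A simple substring search across the free-form lines.
term = "%green way%"
matches = conn.execute(
    "SELECT address1, city, postal_code FROM addresses "
    "WHERE address1 LIKE ? OR address2 LIKE ?",
    (term, term),
).fetchall()
print(matches)
```

One row per address keeps inserts and searches to a single table, at the cost of repeating city and region strings, which is usually an acceptable trade at this scale.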

You seem to be hinting at some kind of one-to-many structure in your question, where an address could have multiple building names or numbers. Without some kind of sequence indicator, you'll have no way of knowing which of these associated records is first. That complicates things significantly.
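If you do end up needing several building names per address, one way to keep their order is an explicit sequence column. The sketch below again uses Python's sqlite3 module as a stand-in for MySQL; the address_building_names table, the sort_order column, and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE addresses (
    address_id INTEGER PRIMARY KEY,
    city       TEXT NOT NULL
);

-- One address, many building names; sort_order records which one comes first.
CREATE TABLE address_building_names (
    address_id    INTEGER NOT NULL REFERENCES addresses(address_id),
    sort_order    INTEGER NOT NULL,
    building_name TEXT NOT NULL,
    PRIMARY KEY (address_id, sort_order)
);
""")

conn.execute("INSERT INTO addresses (address_id, city) VALUES (1, 'Springfield')")
# Inserted out of order on purpose: row order in a table is not a guarantee.
conn.executemany(
    "INSERT INTO address_building_names (address_id, sort_order, building_name) "
    "VALUES (?, ?, ?)",
    [(1, 2, "The Annex"), (1, 1, "Green Way Tower")],
)

names = [
    row[0]
    for row in conn.execute(
        "SELECT building_name FROM address_building_names "
        "WHERE address_id = ? ORDER BY sort_order",
        (1,),
    )
]
print(names)
```

Without the ORDER BY on sort_order, the database is free to return the names in any order, which is exactly the complication described above.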

When worrying about normalization, start with the simplest thing that works, and fix any obvious mistakes. Unless you have massive amounts of data to deal with, you can usually adjust your schema fairly easily if you haven't over-done it with normalization.

