
MySQL Database: How far to Normalize / Queries VS Join / Unique Index

Lately I have been designing a database. It consists of several InnoDB tables:

Table 1: Country ( id , country_name )

Table 2: City ( id, city_name , countryid )

Table 3: Users ( id , cityid , A , B, C, D, E )

In the Users table, A, B, C, D and E are characteristics of the user. Characteristic A combined with cityid must be unique, which is why I created a unique index on these two columns:

CREATE UNIQUE INDEX idx_user ON Users(cityid , A);

The remaining columns B, C, D and E are other user characteristics (for example hair color, height, weight), whose values will naturally repeat across rows (hair color = black, weight = 75 kg).

countryid and cityid are defined as foreign keys with ON UPDATE CASCADE and ON DELETE CASCADE.
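
A minimal sketch of the schema described so far, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL/InnoDB so that it runs without a server. The column types, the country 'Spain' and the sample values 'a1', 'black' and 'brown' are illustrative assumptions; only the table names, column names and the unique index come from the description above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL/InnoDB.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE Country (id INTEGER PRIMARY KEY, country_name TEXT NOT NULL);
CREATE TABLE City (
    id INTEGER PRIMARY KEY,
    city_name TEXT NOT NULL,
    countryid INTEGER NOT NULL
        REFERENCES Country(id) ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE TABLE Users (
    id INTEGER PRIMARY KEY,
    cityid INTEGER NOT NULL
        REFERENCES City(id) ON UPDATE CASCADE ON DELETE CASCADE,
    A TEXT NOT NULL,
    B TEXT, C TEXT, D TEXT, E TEXT
);
CREATE UNIQUE INDEX idx_user ON Users(cityid, A);
""")

conn.execute("INSERT INTO Country VALUES (1, 'Spain')")
conn.execute("INSERT INTO City VALUES (2, 'Madrid', 1)")
conn.execute("INSERT INTO Users (cityid, A, B) VALUES (2, 'a1', 'black')")

# A second row with the same (cityid, A) pair is rejected by the unique index.
try:
    conn.execute("INSERT INTO Users (cityid, A, B) VALUES (2, 'a1', 'brown')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the country cascades to its cities and then to their users.
conn.execute("DELETE FROM Country WHERE id = 1")
users_left = conn.execute("SELECT COUNT(*) FROM Users").fetchone()[0]
```

The database itself refuses the duplicate pair, whatever the application layer does. A check in PHP or JavaScript can still be useful for showing a friendly error message, but it cannot replace the index.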

Searches will be based on the cityid and A columns: a drop-down menu selects the city (hence cityid), a text box takes characteristic A, and the user then presses the SEARCH button.

My questions are:

  1. In the Users table, the same values repeat within a column (columns B, C, D and E). I believe this violates 2NF. Do I have to create a separate table for each of these columns and reference each one from the Users table with a foreign key in order to achieve 2NF?

    Table B (id, Bchar)

    Table C (id, Cchar)

    Table D (id, Dchar)

    Table E (id, Echar)

    Users (id, cityid , A , Bid, Cid, Did, Eid)

  2. For the time being I will not search on columns B, C, D and E; I will only display them after searching by cityid and A. If I later decide to display all Users who live in a given cityid and have black hair, what should I keep in mind now while designing the database?

  3. On one hand we have DML (INSERT, UPDATE, DELETE) and on the other hand querying (SELECT). DML tends to be faster on normalized databases and querying on denormalized ones. Is there a middle ground?

  4. Will the UNIQUE INDEX created above be enough to ensure uniqueness of the combination of cityid and A? Do I need to enforce it additionally in JavaScript or, better, PHP?

  5. Multiple queries vs. joins: normalizing the database will require either multiple queries or a single query with joins. Take the case where the user searches for a user from Madrid with characteristic A:

    a) Multiple queries:

    i) Look up the id of Madrid in the City table (for example, id = 2)

    ii) Given the Madrid id and the input for characteristic A, query the Users table: SELECT * FROM Users WHERE cityid = 2 AND A = 'characteristic';

    b) INNER JOIN:

     i) SELECT City.city_name, Users.B, Users.C FROM City INNER JOIN Users ON Users.cityid = City.id WHERE City.city_name = 'Madrid' AND Users.A = 'characteristic';

    Which one should I prefer?
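
The two approaches from question 5 can be sketched side by side. This is only an illustration, assuming SQLite (through Python's sqlite3 module) in place of MySQL; the function names search_two_queries and search_join and all sample rows are hypothetical. The join variant filters on the city name and on A in its WHERE clause so that both functions return the same rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE City (id INTEGER PRIMARY KEY, city_name TEXT, countryid INTEGER);
CREATE TABLE Users (id INTEGER PRIMARY KEY, cityid INTEGER, A TEXT, B TEXT, C TEXT);
CREATE UNIQUE INDEX idx_user ON Users(cityid, A);
INSERT INTO City VALUES (1, 'Barcelona', 1), (2, 'Madrid', 1);
INSERT INTO Users (cityid, A, B, C) VALUES
    (2, 'a1', 'black', '75 kg'),
    (2, 'a2', 'brown', '80 kg'),
    (1, 'a1', 'blond', '68 kg');
""")

def search_two_queries(city_name, a):
    # a) Two round trips: resolve the city id first, then look up the user.
    row = conn.execute(
        "SELECT id FROM City WHERE city_name = ?", (city_name,)
    ).fetchone()
    if row is None:
        return []
    return conn.execute(
        "SELECT B, C FROM Users WHERE cityid = ? AND A = ?", (row[0], a)
    ).fetchall()

def search_join(city_name, a):
    # b) One round trip: the join resolves the city and the user together.
    return conn.execute(
        """SELECT Users.B, Users.C
           FROM City INNER JOIN Users ON Users.cityid = City.id
           WHERE City.city_name = ? AND Users.A = ?""",
        (city_name, a),
    ).fetchall()
```

Both variants pass user input as bound parameters rather than concatenating it into the SQL string. Since the drop-down menu already supplies cityid, the lookup in step a) i) may not be needed at all, in which case a single SELECT on Users using the (cityid, A) index is enough.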

Thanks in advance.

Your tables are already in 2NF. The condition for 2NF is that there is no partial dependency: no non-key column may depend on only part of a candidate key. Take your Users table: id is the primary key, and (cityid, A) is a second key, more properly called a candidate key, because it also uniquely identifies a row. The table would violate 2NF if cityid alone or A alone were enough to determine B, C, D or E. In your case both values (cityid, A) are needed to identify a unique record, so the table is already normalized to 2NF. Values that repeat within a column, such as hair color = black, do not by themselves break 2NF.
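
The partial-dependency test can be tried on sample data. The sketch below is an assumption-laden illustration: determines is a hypothetical helper and the rows are made up. Sample rows can only refute a dependency, never prove one, because a functional dependency is a statement about what the data means, not about the rows that happen to exist.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always go with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # Remember the first rhs value seen for this key; any later mismatch
        # refutes the dependency lhs -> rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample of the Users table (only B and C shown).
users = [
    {"id": 1, "cityid": 2, "A": "a1", "B": "black", "C": "75 kg"},
    {"id": 2, "cityid": 2, "A": "a2", "B": "brown", "C": "80 kg"},
    {"id": 3, "cityid": 1, "A": "a1", "B": "blond", "C": "68 kg"},
]

cityid_alone = determines(users, ["cityid"], ["B", "C"])   # part of the key
a_alone = determines(users, ["A"], ["B", "C"])             # other part of the key
full_key = determines(users, ["cityid", "A"], ["B", "C"])  # whole candidate key
```

Here neither cityid nor A on its own determines B and C, while the full pair does, which is the situation the 2NF condition asks for.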

Note:

3NF additionally requires that there is no transitive dependency, that is, no non-key column determined by another non-key column (if A -> B and B -> C, then indirectly A -> C, where B is not a key). In the Users table, id determines a unique (cityid, A) pair, and that pair in turn determines a unique (B, C, D, E) record. Because (cityid, A) is itself a candidate key, this chain does not count as a transitive dependency. The Users table would fail 3NF only if one of the characteristics B, C, D or E determined another, for example if C could always be derived from B.
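
To make "transitive dependency" concrete, consider a hypothetical variant of Users that also stored countryid (this column is not in the original Users table; it is an assumption made only for illustration, as are the helper determines and the sample rows). There id -> cityid and cityid -> countryid, with cityid not being a key of the table.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always go with equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical wide Users table that repeats countryid on every row.
users_wide = [
    {"id": 1, "cityid": 2, "A": "a1", "countryid": 1},
    {"id": 2, "cityid": 2, "A": "a2", "countryid": 1},
    {"id": 3, "cityid": 5, "A": "a1", "countryid": 7},
]

# cityid determines countryid, yet cityid does not identify a row:
# that combination is the transitive dependency id -> cityid -> countryid.
city_determines_country = determines(users_wide, ["cityid"], ["countryid"])
city_is_key = determines(users_wide, ["cityid"], ["id"])

# The usual repair: move the dependent column into its own table.
city = sorted({(r["cityid"], r["countryid"]) for r in users_wide})
users_3nf = [{k: r[k] for k in ("id", "cityid", "A")} for r in users_wide]
```

The split result has the same shape as the City and Users tables in the question, where countryid is stored once per city rather than once per user.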

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.

 