
Why are two primary keys in a table not allowed?

I want to understand the reasoning behind why MySQL's grammar does not allow two primary keys in the same table. I understand composite keys; this question is not about them.

I cannot understand why two separate primary keys in the same table are not allowed, when we can have one column as the primary key and another column as unique and not null. That practically acts as two different primary keys, since we can distinctly identify a row using either of them. Why is the first not allowed but the latter is?

While trying to figure out the logic, I stumbled upon Rule 2 of Codd's 12 rules, which says:

Each and every datum (atomic value) in a relational data base is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name.

Must the combination of table name, primary key value and column name be unique for a single datum, or can multiple combinations guarantee access? Why or why not?

Edit: Since this question was flagged as a possible duplicate, I will explain how it is different. The other, similar question was:

Can I have multiple primary keys in a single table?

The answer is no.

My question is why. What is the reasoning?

What does Codd intend to say with Rule 2?

What problems would have occurred if multiple separate primary keys were allowed?

You can have as many UNIQUE KEY constraints as your database system allows, and many relational purists now see it as a mistake to elevate one of these keys and anoint it as PRIMARY.

Logically, both unique key constraints and primary keys fulfil the same purposes - defining a subset of columns by which rows may be uniquely identified. They also should be eligible as targets for foreign key constraints.

Certain defaults (such as nullability) are automatically applied by primary key constraints but there's no reason you cannot apply the same constraints manually to columns included in unique key constraints.
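To make that concrete, here is a minimal sketch, assuming SQLite through Python's standard sqlite3 module rather than MySQL. The employee table and its columns are made up for illustration. It declares one PRIMARY KEY plus a hand-written NOT NULL UNIQUE column and checks that both reject duplicates and that either one finds the same row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,   -- the one declared primary key
        ssn         TEXT NOT NULL UNIQUE,  -- a second key, constraints added by hand
        name        TEXT
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, '111-11-1111', 'Ann')")


def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Each key enforces uniqueness on its own.
dup_pk = is_rejected("INSERT INTO employee VALUES (1, '222-22-2222', 'Bob')")
dup_ssn = is_rejected("INSERT INTO employee VALUES (2, '111-11-1111', 'Bob')")
# NOT NULL has to be spelled out on the unique column.
null_ssn = is_rejected("INSERT INTO employee VALUES (3, NULL, 'Cy')")

# Either key identifies the same single row.
by_id = conn.execute(
    "SELECT name FROM employee WHERE employee_id = 1"
).fetchone()
by_ssn = conn.execute(
    "SELECT name FROM employee WHERE ssn = '111-11-1111'"
).fetchone()

print(dup_pk, dup_ssn, null_ssn, by_id == by_ssn)
```

Other engines may differ in the defaults they apply, so treat this as an illustration of the idea rather than a statement about MySQL's exact behaviour.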

'Primary key' can be traced back to Ted Codd, inventor of the relational model on which SQL is loosely based. Codd clearly realised that when a relation is stored in a DBMS as a variable (relvar, r-table, table, whatever) it could have multiple candidate keys that should be enforced. He originally thought that nominating one or more of the keys as 'primary' might be beneficial. Since then (late 1960s!) RM thinking has moved on and the idea of a 'primary' key is no longer considered potentially beneficial. A similar thing can be said of nulls: Codd went on to propose two types of null, i.e. four-valued logic. What can I say? The man was a genius but not infallible!

Sadly, much of the early RM thinking made it into the early SQL implementations and was later enshrined in the SQL standards. And because of the "shackles of compatibility", it will never be removed from the SQL standards. But these things were not properly implemented in SQL. For example, Codd didn't specify that a relvar should be restricted to one primary key, but as implemented in SQL, one per table is now the rule. Why? Misunderstanding of Codd's original paper by early SQL implementers? Canny explicit design by relational theorists? I assume cock-up over conspiracy!

The differences between a PK and an equivalent NOT NULL UNIQUE constraint are minor and not terribly useful. For example, when specifying a foreign key in SQL, you must specify the referenced table but you can omit the referenced columns. When specifying columns, they must be "equal to the set of column names in the unique columns of a unique constraint of the referenced table" (i.e. either the PK or any NOT NULL UNIQUE constraint). When the columns are omitted from the definition, the referenced table's PK is assumed. If a table were allowed to have more than one PK, this shorthand would be ambiguous and really wouldn't work. As I say, not a big deal but maybe important to some users.
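A small sketch of that foreign key shorthand, assuming SQLite via Python's sqlite3 module (SQLite resolves a bare REFERENCES parent to the parent's primary key; other engines may not accept the short form). The parent and child tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE parent (
        id   INTEGER PRIMARY KEY,
        code TEXT NOT NULL UNIQUE
    );
    -- Column list omitted: the parent's primary key (id) is assumed.
    CREATE TABLE child_default (
        parent_ref INTEGER REFERENCES parent
    );
    -- Column list given: any unique key of the parent can be the target.
    CREATE TABLE child_by_code (
        parent_code TEXT REFERENCES parent (code)
    );
    INSERT INTO parent VALUES (1, 'A');
    """
)


def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


ok_default = not is_rejected("INSERT INTO child_default VALUES (1)")
bad_default = is_rejected("INSERT INTO child_default VALUES (99)")
ok_by_code = not is_rejected("INSERT INTO child_by_code VALUES ('A')")
bad_by_code = is_rejected("INSERT INTO child_by_code VALUES ('Z')")

print(ok_default, bad_default, ok_by_code, bad_by_code)
```

With two primary keys on parent, the first child table would have no single obvious target, which is the ambiguity described above.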

A primary key has three properties:

  • The combination of values is unique.
  • Each value in the key is NOT NULL.
  • There is only one primary key per table.

A table can have as many keys or key combinations as you like that meet the first two conditions. You are asking why the third condition is needed.
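The third property is enforced at table-definition time. As a rough illustration, assuming SQLite through Python's sqlite3 module (MySQL reports its own error for the same mistake), declaring two PRIMARY KEY columns fails, while one PRIMARY KEY plus a NOT NULL UNIQUE column is accepted. The table names t_two_pks and t_one_pk are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two separate PRIMARY KEY declarations: the CREATE TABLE itself is rejected.
error_message = None
try:
    conn.execute(
        "CREATE TABLE t_two_pks (a INTEGER PRIMARY KEY, b INTEGER PRIMARY KEY)"
    )
except sqlite3.Error as exc:
    error_message = str(exc)

# One PRIMARY KEY plus a key that meets the first two properties: accepted.
conn.execute(
    "CREATE TABLE t_one_pk (a INTEGER PRIMARY KEY, b INTEGER NOT NULL UNIQUE)"
)
created = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

print(error_message)
print(created)
```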

It is a matter of convenience for defining relational logic that each table has a "named" method of accessing a particular row. For instance, this gives guidance on how best to define foreign key relationships (use the primary key). It is simply the definition.

Some databases use the primary key for clustering the data -- that is, sorting the rows on the data pages. Given that the data can only be sorted one way, this offers a practical reason why there can be only one.

From a normalization perspective

There's a lot of computer science behind database algorithms, and like any science it has to make assumptions. One of them is that data are stored in a form that is normalized. Everything in the row must depend on the key (1st normal form), the whole key (2nd normal form), and nothing but the key (3rd normal form). When you stray from this, you get less predictable and usually poorer performance.

A row can have any number of candidate keys, each of which could satisfy the criteria for being a primary key. I suppose you could call the others "secondary" or "tertiary" keys, but nobody really does that. If another value is needed, e.g. a natural key, it is typically set up as an attribute, not a key.

That being said, you could take any two columns and call them a compound key, and also declare that key the primary key. That way, there are indeed two columns participating in the primary key relationship. But that leads to problems in performance.
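For reference, a compound primary key can be sketched like this, again assuming SQLite via Python's sqlite3 module and a made-up enrollment table. Uniqueness applies to the pair of columns, not to each column separately, so this is still one primary key and not two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        PRIMARY KEY (student_id, course_id)  -- one key made of two columns
    )
    """
)

# Repeating a value in a single column is fine as long as the pair is new.
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [(1, 10), (1, 11), (2, 10)],
)

# Repeating the whole pair violates the primary key.
try:
    conn.execute("INSERT INTO enrollment VALUES (1, 10)")
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_pair_rejected, row_count)
```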

From a performance perspective

One key is necessary and sufficient to achieve a normalized schema. More than one key could be set up, but they would contain redundant data -- if you know one, you know the other, if you know who to ask -- and be in violation of 2nd normal form. It also means each row would take up more space than it really needs to. A bigger row means fewer rows per page, which means poorer performance, especially where the primary key is used as the clustering key (as in InnoDB, for example) and is therefore contained in the leaf pages of all indexes on the table. Why spend the bytes on something you already know?

Typical practice

Store any additional "keys" as attributes in the row where the entity is defined. For example, you could store the social security number as an attribute of the Employee table, where EmployeeID is the primary (and possibly surrogate) key. Whenever you need it, join to the Employee table. (And by the way, you might want to tighten the column-level permissions on the SSN.) Don't store it in multiple places; there's no need to.
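A sketch of that practice, assuming SQLite via Python's sqlite3 module. Employee, EmployeeID and SSN come from the example above; the Timesheet table and its columns are hypothetical, added only so there is something to join from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,  -- surrogate primary key
        SSN        TEXT NOT NULL,        -- stored once, as a plain attribute
        Name       TEXT NOT NULL
    );
    -- Hypothetical referencing table: it carries EmployeeID only, never the SSN.
    CREATE TABLE Timesheet (
        TimesheetID INTEGER PRIMARY KEY,
        EmployeeID  INTEGER NOT NULL REFERENCES Employee (EmployeeID),
        Hours       REAL NOT NULL
    );
    INSERT INTO Employee VALUES (1, '111-11-1111', 'Ann');
    INSERT INTO Timesheet VALUES (100, 1, 7.5);
    """
)

# When the SSN is needed, join back to Employee through the primary key.
row = conn.execute(
    """
    SELECT t.TimesheetID, e.SSN
    FROM Timesheet AS t
    JOIN Employee AS e ON e.EmployeeID = t.EmployeeID
    """
).fetchone()
print(row)
```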
