
Naming the id field of a table properly

I am currently reading the book "SQL Programming Style" written by Joe Celko.

In the first chapter, in the section "Develop Standardized Postfixes", he states for the id column:

"_id" = identifier. It is unique in the schema and refers to one entity anywhere it appears in the schema. Never use "<table_name>_id"

A few pages later he states:

Do not use an underscore as the first or last letter in a name. It looks like the name is missing another component.

He deprecates "id" as a column name.

So I would like to know how you guys name the id column?

I know that most people might wonder what the point of this question is, but I am looking to standardize my data model, following industry standards and ISO standards as much as I can.

I also deprecate the use of "Id" as a column name, even though it has become very widespread. "EmployeeId" is longer than "Id", but it is more descriptive. It also allows a foreign key to generally have the same name as the column to which it refers. This is enormously helpful when control over the database passes from one person to the next.

There is an exception to the above. It's possible to have two foreign keys in the same table that both refer to the same key. It's also possible to have a reflexive foreign key that refers to the key in a different row of the same table where it appears.

Let me give an example of a reflexive key. You have a table of employees, with key EmployeeId. You have another column, called SupervisorId, that records the relationship between a supervisor and several subordinates. The name of the foreign key in this case names the role, and not the entity.
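A minimal sketch of the reflexive key described above, in fairly portable SQL. Only EmployeeId and SupervisorId come from the answer; the table name Employee, the EmployeeName column, and the data types are assumptions made for illustration.

```sql
-- Hypothetical employee table: SupervisorId is a reflexive foreign key.
-- It is named for the role (supervisor), not for the entity (employee).
CREATE TABLE Employee (
    EmployeeId   INT         NOT NULL PRIMARY KEY,
    EmployeeName VARCHAR(50) NOT NULL,
    SupervisorId INT         NULL REFERENCES Employee (EmployeeId)
);

-- List each employee next to their supervisor.
-- The LEFT JOIN keeps employees who have no supervisor.
SELECT e.EmployeeName,
       s.EmployeeName AS SupervisorName
FROM Employee e
LEFT JOIN Employee s
    ON e.SupervisorId = s.EmployeeId;
```

Because the two sides of the join carry different column names, the ON clause reads as "supervisor of e is s" without any comment.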

As an alternative, it's possible to use user-defined domains to document the fact that two columns refer to the same thing. Again, this is most useful when the fundamental meaning of the data has to be communicated to someone new.
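A rough sketch of the domain idea, assuming a database that supports CREATE DOMAIN (it is part of the SQL standard and available in PostgreSQL; SQL Server does not have it and offers alias types through CREATE TYPE instead). The domain name EmployeeIdType, the ProjectAssignment table, and its columns are all hypothetical.

```sql
-- Hypothetical domain: one declared type for every column that holds an employee identifier.
CREATE DOMAIN EmployeeIdType AS INTEGER;

-- Two differently named columns; the shared domain documents that
-- both hold the same kind of value.
CREATE TABLE ProjectAssignment (
    ProjectName VARCHAR(50)    NOT NULL PRIMARY KEY,
    LeadId      EmployeeIdType NOT NULL,
    ReviewerId  EmployeeIdType NOT NULL
);
```

The column names describe the roles, while the domain carries the information that both are employee identifiers.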

The use of underscore as an internal visual separator inside a symbol is a completely separate issue. Camel casing has become more widespread than underscore, and there are even systems where underscore is not allowed as a symbol constituent.

Above all, keep it consistent. If you use arbitrary, capricious, and contradictory naming conventions, you'll eventually confuse even yourself.

I think it's a good question. Do what looks good to you, and always do that, every time. Then you'll be fine.

I use the tablename + 'id' model: UserId, PersonId etc.

For Table IDs I always use tablename + ID.

The reason for this is to avoid ambiguous column names in queries when it is a 1-to-1 mapping.

Sometimes I quickly write up SQL to test, like this:

Select
  * 
FROM table1
Inner join table2 on table1ID = table2ID

If I didn't use the table name in the ID column, this would throw an error (forcing me to use aliases on the tables):

Select
  * 
FROM table1
Inner join table2 on ID = ID
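
For comparison, a sketch of what the generic-ID version needs before it will run: each ID has to be qualified with a table name or alias. The table definitions are hypothetical stand-ins, since the original tables are not shown.

```sql
-- Hypothetical stand-ins for table1 and table2, each with a generic ID column.
CREATE TABLE table1 (ID INT NOT NULL PRIMARY KEY, note VARCHAR(20));
CREATE TABLE table2 (ID INT NOT NULL PRIMARY KEY, note VARCHAR(20));

-- "ON ID = ID" is ambiguous because both tables have an ID column;
-- qualifying each side with an alias resolves it.
SELECT *
FROM table1 t1
INNER JOIN table2 t2
    ON t1.ID = t2.ID;
```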

Another good reason to use the table name: quick test queries that check what data exists generally use "*" to select columns. If you do a join and SELECT *, it is sometimes difficult to tell which ID came from which table, especially if you are returning a large number of columns from more than two tables.

I always advocate for globally unique TABLENAME_ID. On that note, I strongly encourage table names that fully describe their context, so there is never any ambiguity as to their application when foreign references are made.

Rather than share my opinions on naming standards, I'll attempt to answer your question ;)

I think the point Celko is making is that student_ID in a table of students is a code smell, i.e. it could be that the designer's style is to always add an ID column, probably an auto-increment column, to every table they create in the physical model (even when there is no such column in the logical model), with the intention of using these ID columns for foreign keys. In other words, Celko does not want you to always use a surrogate key; rather, he wants you to use natural keys where appropriate.
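To make the surrogate versus natural key distinction concrete, here is a small sketch in fairly portable SQL. It is not taken from the book: the tables, the columns, and the idea of a school-issued student number are all assumptions chosen for illustration.

```sql
-- Natural key: the two-letter country code identifies the row by itself,
-- so no extra auto-increment ID column is added.
CREATE TABLE Countries (
    country_code CHAR(2)     NOT NULL PRIMARY KEY,  -- e.g. an ISO 3166-1 alpha-2 code
    country_name VARCHAR(60) NOT NULL
);

-- Hypothetical: students are keyed by a number the school already issues,
-- and the foreign key reuses the natural key of Countries under the same name.
CREATE TABLE Students (
    student_nbr  CHAR(9)     NOT NULL PRIMARY KEY,
    student_name VARCHAR(60) NOT NULL,
    country_code CHAR(2)     NOT NULL REFERENCES Countries (country_code)
);
```

With keys like these there is no column of the form table name + _ID at all, which seems to be the outcome the quoted rule is steering toward.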

If you read on to section 1.2.5 (p14-15) and follow his rules for table names, you'll discover why table name + _ID is an unlikely occurrence:

if I cannot find an industry standard (name), I would look for a collective or class name... Exception: use a singular name if the table actually has one and only one row in it.

So, for example, if you had a table containing student data it may be called Students rather than Student, but it is more likely to be Enrolment (or similar). And a table containing one and only one row is unlikely to need an _ID column.

I suppose there are nouns for which the plural is the same as the singular, so maybe Sheep_ID is acceptable (but only in the absence of an industry standard ovine identifier, of course!)

Also consider rule 1.3.2 (p19), Avoid Names That Change From Place to Place, e.g. the same domain referred to in the Students table as ID and in other tables as student_ID. It is unlikely that there will only be one element named _ID in the entire schema!

ID as a column name is hard to maintain and in my opinion can more easily lead to mistakes in joins.

Suppose, for instance, you always used ID as a column name in every table.

Now suppose you need to join to six of those tables. And being a typical person, you copy the first join and change the table names. If you miss one, and you use id, you will get a query that runs and gives the wrong answer. If you use tablenameId, the query fails with an invalid column name error instead of returning wrong data. See the following code for an example:

create table #test1 (id int identity, test varchar(10))
create table #test2 (id int identity, test varchar(10))
create table #test3 (id int identity, test varchar(10))

insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')
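
-- Mistake: the second ON clause was copied without changing t2 to t3.
-- The query still runs, but it returns the wrong rows.
select * 
from #test1 t1
join #test2 t2
    on t1.id = t2.id
join #test3 t3
    on t1.id = t2.id

-- Correct: the second ON clause compares t1 with t3.
select * 
from #test1 t1
join #test2 t2
    on t1.id = t2.id
join #test3 t3
    on t1.id = t3.id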

Drop table #test1
drop table #test2
drop table #test3   
Go

create table #test1 (t1id int identity, test varchar(10))
create table #test2 (t2id int identity, test varchar(10))
create table #test3 (t3id int identity, test varchar(10))

insert #test1
values ('hi')
insert #test1
values ('hello')
insert #test2
values ('hi there')
insert #test3
values ('hello')
insert #test3
values ('hi')

-- Correct: each ON clause names the column that belongs to its table.
select * 
from #test1 t1
join #test2 t2
    on t1.t1id = t2.t2id
join #test3 t3
    on t1.t1id = t3.t3id

-- Mistake: t2 has no column t3id, so this query fails with an error
-- instead of silently returning the wrong rows.
select * 
from #test1 t1
join #test2 t2
    on t1.t1id = t2.t2id
join #test3 t3
    on t1.t1id = t2.t3id

drop table #test1
drop table #test2
drop table #test3

Another thing about using tablenameId is that when you want the actual id from several tables in a complex reporting query, you don't have to create aliases in order to see which id came from where (and to make the reporting application happy, as most of them insist on unique field names for a report).
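A small side-by-side sketch of that reporting point, in fairly portable SQL. The Customer and Invoice tables (suffixed 1 and 2 so both styles can coexist) and their columns are hypothetical.

```sql
-- Style 1: every key is called id, so the report has to alias each one
-- to get unique, readable field names.
CREATE TABLE Customer1 (id INT NOT NULL PRIMARY KEY, CustomerName VARCHAR(50));
CREATE TABLE Invoice1  (id INT NOT NULL PRIMARY KEY,
                        customer_id INT REFERENCES Customer1 (id));

SELECT c.id AS CustomerId, i.id AS InvoiceId
FROM Customer1 c
JOIN Invoice1 i
    ON i.customer_id = c.id;

-- Style 2: keys carry the table name, so the output columns are already
-- unique without aliases.
CREATE TABLE Customer2 (CustomerId INT NOT NULL PRIMARY KEY, CustomerName VARCHAR(50));
CREATE TABLE Invoice2  (InvoiceId INT NOT NULL PRIMARY KEY,
                        CustomerId INT REFERENCES Customer2 (CustomerId));

SELECT c.CustomerId, i.InvoiceId
FROM Customer2 c
JOIN Invoice2 i
    ON i.CustomerId = c.CustomerId;
```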

Wow, I was going to write "I always use TablenameID but everyone else in the world disagrees with me". However, it looks like everyone here agrees with me.

That is, of course, when I use a surrogate integer ID in the table. If there's a natural primary key I use that instead.

In my database:

For a foreign key ID, I use the singular version of the foreign table name + "Id". I use a capital I and a lowercase d, as that is a standard ingrained in me by FxCop.

For auto-incrementing identities I often use "SequenceId".

In my data layer:

I use the name of the object + "Id", following best practice standards for "Id".
