
2 Database Design Questions: Hierarchy Tree

1.) I have a DB where each entry represents a task. Out of several dozen or even a hundred tasks, there will be one special task (a milestone).
So, in this case, I have very few entries that require an extra field to separate them from the majority.

I don't want to create a second table, because this is the only field that makes these milestones special; they share a lot of other fields with regular task entries.

Should I create another field just to hold a few TRUEs while the rest are FALSE by default?

2.) Each of those tasks has a variable number of performers (depending on user input). To complicate things further, each performer has multiple sub-performers of its own. So I am essentially using a DB to describe a TREE structure. The way I have it now, I will have 5 copies of the same task info if there are 5 performers, occupying 5 entries. Is this the way to go if I'm not going to have more than 10,000 entries (incl. copies) in my DB?

Thank you

This should clarify it:

  1. Task1 (this is a milestone task)

    • performer1
      • sub-performer ID=21
      • sub-performer ID=542
    • performer2
  2. Task2 (this is not a milestone task)

    • performer2
      • sub-performer ID=231

Sub-performers and performers are completely different groups. No overlap at all. Sub-performers are the group that provides inputs to performers, so performers can complete the tasks they're assigned to.

I am not sure if this is what you want:

tblTask with columns taskID, isMilestone, and everything else you need.

tblAgent with columns agentID and everything else you need (these will be the (sub-)performers).

tblPerformance with columns fk_agentID, fk_task

tblSubperformance with columns fk_agentID_performer, fk_agentID_subperformer

The fk_ columns are foreign keys referencing:

fk_agentID -> tblAgent.agentID
fk_task -> tblTask.taskID
fk_agentID_performer -> tblAgent.agentID
fk_agentID_subperformer -> tblAgent.agentID
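
As a rough illustration of the four tables above, here is a minimal sketch using Python's built-in sqlite3 module. The table and foreign-key column names come from the answer; the `title` and `name` columns, the composite primary keys, and all sample rows are assumptions made up for the example.

```python
import sqlite3

# In-memory database; schema names follow the answer, sample data is invented.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE tblTask (
    taskID      INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,                       -- assumed column
    isMilestone INTEGER NOT NULL DEFAULT 0 CHECK (isMilestone IN (0, 1))
);
CREATE TABLE tblAgent (
    agentID INTEGER PRIMARY KEY,
    name    TEXT NOT NULL                            -- assumed column
);
CREATE TABLE tblPerformance (
    fk_agentID INTEGER NOT NULL REFERENCES tblAgent(agentID),
    fk_task    INTEGER NOT NULL REFERENCES tblTask(taskID),
    PRIMARY KEY (fk_agentID, fk_task)
);
CREATE TABLE tblSubperformance (
    fk_agentID_performer    INTEGER NOT NULL REFERENCES tblAgent(agentID),
    fk_agentID_subperformer INTEGER NOT NULL REFERENCES tblAgent(agentID),
    PRIMARY KEY (fk_agentID_performer, fk_agentID_subperformer)
);
""")

conn.executemany("INSERT INTO tblTask (taskID, title, isMilestone) VALUES (?, ?, ?)",
                 [(1, "Task1", 1), (2, "Task2", 0)])
conn.executemany("INSERT INTO tblAgent (agentID, name) VALUES (?, ?)",
                 [(1, "performer1"), (2, "performer2"),
                  (21, "sub-performer 21"), (542, "sub-performer 542")])
conn.executemany("INSERT INTO tblPerformance (fk_agentID, fk_task) VALUES (?, ?)",
                 [(1, 1), (2, 1), (2, 2)])
conn.executemany("INSERT INTO tblSubperformance VALUES (?, ?)",
                 [(1, 21), (1, 542)])

# Rebuild the task -> performer -> sub-performer tree with joins.
rows = conn.execute("""
    SELECT t.title, p.name, s.name
    FROM tblTask t
    JOIN tblPerformance tp      ON tp.fk_task = t.taskID
    JOIN tblAgent p             ON p.agentID = tp.fk_agentID
    LEFT JOIN tblSubperformance sp ON sp.fk_agentID_performer = p.agentID
    LEFT JOIN tblAgent s        ON s.agentID = sp.fk_agentID_subperformer
    ORDER BY t.taskID, p.agentID, s.agentID
""").fetchall()

for row in rows:
    print(row)
```

One consequence of this layout, at least as sketched here: a sub-performer is linked to a performer rather than to a (task, performer) pair, so the same sub-performers show up under that performer for every task it is assigned to.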

1) Yes, create a boolean flag.

2) No. If you have duplicate data, you have a problem. You need to normalize.

You're really not exploiting the relational nature of databases. The nice way to do it is:

  • Have a table of tasks (with unique ids, without the extra milestone field, without the performers)
  • Have a table of milestones with two columns: the task id and the special milestone field; only milestones will appear in this table
  • Have a table with two columns: task id and performer
  • Have a table with two columns: performer and sub-performer

  • If a performer can have multiple fields, use a performer id in the tables above and have a separate table with the performer id and the other fields
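
The list above could be sketched roughly as follows with Python's sqlite3 module. All table and column names here (`tasks`, `milestones`, `milestone_info`, `performers`, `task_performers`, `performer_subperformers`) are hypothetical, as is the sample data; the point is only that a task counts as a milestone when it has a row in the milestones table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE tasks (
    task_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
-- Only milestone tasks get a row here; task_id is both PK and FK.
CREATE TABLE milestones (
    task_id        INTEGER PRIMARY KEY REFERENCES tasks(task_id),
    milestone_info TEXT NOT NULL
);
-- Extra performer fields live in their own table, keyed by performer_id.
CREATE TABLE performers (
    performer_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL
);
CREATE TABLE task_performers (
    task_id      INTEGER NOT NULL REFERENCES tasks(task_id),
    performer_id INTEGER NOT NULL REFERENCES performers(performer_id),
    PRIMARY KEY (task_id, performer_id)
);
-- subperformer_id is left as a plain id; a subperformers table could back it.
CREATE TABLE performer_subperformers (
    performer_id    INTEGER NOT NULL REFERENCES performers(performer_id),
    subperformer_id INTEGER NOT NULL,
    PRIMARY KEY (performer_id, subperformer_id)
);
""")

conn.executemany("INSERT INTO tasks VALUES (?, ?)", [(1, "Task1"), (2, "Task2")])
conn.execute("INSERT INTO milestones VALUES (?, ?)", (1, "made-up milestone detail"))
conn.executemany("INSERT INTO performers VALUES (?, ?)",
                 [(1, "performer1"), (2, "performer2")])
conn.executemany("INSERT INTO task_performers VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2)])
conn.executemany("INSERT INTO performer_subperformers VALUES (?, ?)",
                 [(1, 21), (1, 542)])

# A task is a milestone exactly when the LEFT JOIN finds a milestones row.
flags = conn.execute("""
    SELECT t.title, m.task_id IS NOT NULL AS is_milestone
    FROM tasks t
    LEFT JOIN milestones m ON m.task_id = t.task_id
    ORDER BY t.task_id
""").fetchall()
print(flags)
```

Compared with a boolean column on the tasks table, this keeps milestone-only data out of ordinary task rows at the cost of one extra join.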

Re: Comment

I have read that normalization can reduce DB efficiency; that's why I combined them all.

Where? It's a pretty strange claim.

For the table that contains task id and performer (the 3rd on your list), would it look like this if task 143 needs Staff A, B, C? In the DB: (row 1 | 143 | A) (row 2 | 143 | B) (row 3 | 143 | C). Don't you still have redundancy?

The repetition in the third table isn't a redundancy problem because you aren't replicating any information: the information in the table is about relationships, and there are three relationships in three rows.

A redundancy problem appears when you have a setup like yours, where, let's say, task 143 has a completion_date of "May 31, 2011". Then your table would look like:

task_id  completion_date  performer
143      May 31, 2011     A
143      May 31, 2011     B
143      May 31, 2011     C

Now let's say I want to change the completion_date for task 143. In your setup I have to change it in all three rows, and what's worse, if someone does something wrong you could get an inconsistent table like:

task_id  completion_date  performer
143      May 31, 2011     A
143      May 12, 2011     B
143      May 31, 2101     C

And now you don't know which is the right completion_date. When you normalize, you only have one row in one table to change when the date changes, and your database is never inconsistent like that.
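
To make the single-row update concrete, here is a small sketch using Python's sqlite3 module. The table names `tasks` and `task_performers`, the ISO date format, and the new date are assumptions for illustration; the task id and performers A, B, C follow the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tasks (
    task_id         INTEGER PRIMARY KEY,
    completion_date TEXT NOT NULL            -- stored as an ISO 8601 string
);
CREATE TABLE task_performers (
    task_id   INTEGER NOT NULL REFERENCES tasks(task_id),
    performer TEXT NOT NULL,
    PRIMARY KEY (task_id, performer)
);
""")
conn.execute("INSERT INTO tasks VALUES (143, '2011-05-31')")
conn.executemany("INSERT INTO task_performers VALUES (143, ?)",
                 [("A",), ("B",), ("C",)])

# The date lives in exactly one row, so one UPDATE changes it everywhere.
cur = conn.execute(
    "UPDATE tasks SET completion_date = ? WHERE task_id = ?",
    ("2011-06-15", 143),
)
updated_rows = cur.rowcount

# Joining reproduces the wide, three-row view without storing the date three times.
view = conn.execute("""
    SELECT tp.task_id, t.completion_date, tp.performer
    FROM task_performers tp
    JOIN tasks t ON t.task_id = tp.task_id
    ORDER BY tp.performer
""").fetchall()
print(updated_rows, view)
```

The joined result still shows one row per performer, but since the date is read from the single tasks row, the three rows cannot disagree with each other.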
