
Advantage of relational data in a tagging database

Imagine a simple database (MySQL) with objects, and tags that can be attached to the objects. What are the advantages of storing the tags in a separate table (so you have the tables tag, object, tag_has_object) over making just two tables (tag and object, where tags are saved directly as strings rather than as numbers)?

While I'm used to making everything relational, someone proposed doing it the second way (two tables), and I can't come up with any counterarguments. Is there an advantage of one over the other?

The main difference is how you intend to use it.

Tags as strings:

  • Easy to insert
  • Might end up with slow select queries
  • Will use more storage, because the tag text is repeated for every object it is attached to

Tags as table:

  • More difficult inserts
  • Faster select queries
  • Will use less storage, because each tag's text is stored once

So if your app is not going to be very big, there is no problem with using strings.
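To make the insert trade-off concrete, here is a minimal sketch of both designs side by side. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the SQL dialect differs slightly (for example, MySQL spells the statement INSERT IGNORE). The table tag_text, the column names, and both helper functions are assumptions made for illustration; only tag, object and tag_has_object come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory SQLite, standing in for MySQL
conn.executescript("""
    CREATE TABLE object (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Two-table design (hypothetical name tag_text): tag stored as a string.
    CREATE TABLE tag_text (
        object_id INTEGER NOT NULL REFERENCES object(id),
        tag       TEXT    NOT NULL,
        PRIMARY KEY (object_id, tag)
    );

    -- Three-table design: each tag is stored once and linked by id.
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE tag_has_object (
        tag_id    INTEGER NOT NULL REFERENCES tag(id),
        object_id INTEGER NOT NULL REFERENCES object(id),
        PRIMARY KEY (tag_id, object_id)
    );
""")


def tag_as_string(conn, object_id, tag):
    """Two-table insert: a single statement, no lookup needed."""
    conn.execute(
        "INSERT INTO tag_text (object_id, tag) VALUES (?, ?)", (object_id, tag)
    )


def tag_as_row(conn, object_id, tag):
    """Three-table insert: create the tag if missing, look up its id, then link."""
    conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (tag,))
    (tag_id,) = conn.execute(
        "SELECT id FROM tag WHERE name = ?", (tag,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO tag_has_object (tag_id, object_id) VALUES (?, ?)",
        (tag_id, object_id),
    )


conn.execute("INSERT INTO object (id, name) VALUES (1, 'first'), (2, 'second')")
for object_id in (1, 2):
    tag_as_string(conn, object_id, "mysql")
    tag_as_row(conn, object_id, "mysql")

text_copies = conn.execute(
    "SELECT COUNT(*) FROM tag_text WHERE tag = 'mysql'"
).fetchone()[0]
tag_rows = conn.execute("SELECT COUNT(*) FROM tag").fetchone()[0]
links = conn.execute("SELECT COUNT(*) FROM tag_has_object").fetchone()[0]
print(text_copies, tag_rows, links)  # 2 1 2
```

The string design needs one statement per tag but stores the text "mysql" twice; the table design needs up to three statements but stores the text once and two small link rows.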

The three-table option implies there is a predefined list of tags which you associate with objects. The two-table option implies the tag is free text and could be any value.

Whether, in the three-table option, you add a surrogate numeric key to the tags table and reference it from the linking table, or use the tag itself as the key and reference that, is a pragmatic choice based on the criteria of familiarity, irreducibility, stability and simplicity. Considering all of these, you need to decide whether a surrogate key is suitable in your specific situation.

Some things to consider:

With only a natural key (the tag itself):

  • No need to join to the tags table to get the tag value (familiarity)
  • A single one-column candidate key on the tags table (simplicity)

With an additional surrogate key:

  • Changes to a tag do not need to be cascaded to the referencing columns (stability)
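The stability point can be illustrated by renaming a misspelled tag under each keying scheme. This is only a sketch using Python's sqlite3 module in place of MySQL; the table names tag_nat, link_nat, tag_sur and link_sur are hypothetical and exist purely to keep the two variants apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    PRAGMA foreign_keys = ON;  -- SQLite enforces foreign keys only when asked

    -- Natural key: the tag text is the key and is copied into the link table.
    CREATE TABLE tag_nat (name TEXT PRIMARY KEY);
    CREATE TABLE link_nat (
        tag_name  TEXT    NOT NULL REFERENCES tag_nat(name) ON UPDATE CASCADE,
        object_id INTEGER NOT NULL,
        PRIMARY KEY (tag_name, object_id)
    );

    -- Surrogate key: the link table only ever sees the numeric id.
    CREATE TABLE tag_sur (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE link_sur (
        tag_id    INTEGER NOT NULL REFERENCES tag_sur(id),
        object_id INTEGER NOT NULL,
        PRIMARY KEY (tag_id, object_id)
    );

    INSERT INTO tag_nat VALUES ('articel');
    INSERT INTO link_nat VALUES ('articel', 1), ('articel', 2);
    INSERT INTO tag_sur VALUES (1, 'articel');
    INSERT INTO link_sur VALUES (1, 1), (1, 2);
""")

# Fix the typo in both variants.
conn.execute("UPDATE tag_nat SET name = 'article' WHERE name = 'articel'")
conn.execute("UPDATE tag_sur SET name = 'article' WHERE name = 'articel'")
conn.commit()

# Natural key: every referencing row had to be rewritten by the cascade.
nat_links = [r[0] for r in conn.execute(
    "SELECT tag_name FROM link_nat ORDER BY object_id")]
# Surrogate key: the referencing rows are untouched; only one tag row changed.
sur_links = [r[0] for r in conn.execute(
    "SELECT tag_id FROM link_sur ORDER BY object_id")]

print(nat_links)  # ['article', 'article']
print(sur_links)  # [1, 1]
```

With the natural key, the database rewrites one link row per tagged object; with the surrogate key, the rename touches a single row in the tags table.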

It's worth thinking about the ways in which you will use the data.

The obvious scenarios are:

  • Insert a new object and associate it with tags. Much easier with two tables, as long as you don't care about validation (is "article" the same as "Article"? Might "articel" be a typo?) or whether the tag already exists.
  • Show all objects matching a given tag. Much easier (and probably faster) with three tables, because you only compare strings on the "tag" table and then join on keys (presumably integers). This is especially important if you support wildcards or other search-like features.
  • Show a "tag cloud". A little easier (and probably faster) with three tables, again because of the string manipulation you may otherwise have to do.
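The two read scenarios above might look like this against the three-table layout. This is a sketch that uses Python's sqlite3 module rather than MySQL; the column names, the sample rows, and the helpers normalize, objects_for_tag and tag_cloud are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE tag_has_object (
        tag_id    INTEGER NOT NULL REFERENCES tag(id),
        object_id INTEGER NOT NULL REFERENCES object(id),
        PRIMARY KEY (tag_id, object_id)
    );
    INSERT INTO object VALUES (1, 'intro'), (2, 'howto'), (3, 'faq');
    INSERT INTO tag VALUES (1, 'article'), (2, 'mysql');
    INSERT INTO tag_has_object VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 3);
""")


def normalize(raw):
    """Hypothetical validation step so that 'Article' matches 'article'."""
    return raw.strip().lower()


def objects_for_tag(conn, raw_tag):
    """String comparison happens once, on tag.name; the joins use integer keys."""
    rows = conn.execute(
        """
        SELECT o.name
        FROM tag AS t
        JOIN tag_has_object AS tho ON tho.tag_id = t.id
        JOIN object AS o ON o.id = tho.object_id
        WHERE t.name = ?
        ORDER BY o.id
        """,
        (normalize(raw_tag),),
    )
    return [name for (name,) in rows]


def tag_cloud(conn):
    """Usage count per tag, most used first."""
    return conn.execute(
        """
        SELECT t.name, COUNT(*) AS uses
        FROM tag AS t
        JOIN tag_has_object AS tho ON tho.tag_id = t.id
        GROUP BY t.id, t.name
        ORDER BY uses DESC, t.name
        """
    ).fetchall()


print(objects_for_tag(conn, " Article "))  # ['intro', 'howto', 'faq']
print(tag_cloud(conn))  # [('article', 3), ('mysql', 2)]
```

Note that normalize only handles case and whitespace; catching a typo such as "articel" would need a separate check against the existing rows in tag.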

In general, I'd accept a little extra pain when inserting a record in exchange for fast retrieval, because most applications do more reads than writes.
