Postgres HStore vs HABTM

I am building an app that has a model that can be tagged with entries from another model, similar to the tagging feature on Stack Overflow.

For example:

class Question < ActiveRecord::Base
    has_and_belongs_to_many :tags
end

class Tag < ActiveRecord::Base
    has_and_belongs_to_many :questions
end

I am debating between setting up a has_and_belongs_to_many relationship with a join table and storing the tags in a hash using Postgres' hstore feature.

I am looking for anyone with similar experience who can speak to performance differences, obstacles, or anything else that should persuade me one way or the other.

EDIT:

I think I should also mention that this will be an API with an AngularJS frontend.

You are describing the topic of a great debate :) Normalization vs. denormalization. A many-to-many relationship lets you run queries such as "how many people use a certain tag" very simply. HStore is very nice as well, but you end up with thousands of copies of the same tags everywhere. I use both approaches in different projects, but the real problem comes when you decide one day to move to a different database. With hstore you will be stuck with PostgreSQL or have to rewrite your code.

If very high speed matters, you need to query the data in different ways, and you often want to load a user record in one fell swoop and show all of its tags, I normally do both. First I create a many-to-many relationship, since tags are usually connected to more objects anyway: a user has many tags from the tags table, and tags are connected to, let's say, brands, which are connected to products, and so on.

Then I add a field holding hstore or JSON data to the users table. Each tag is added to that field when the many-to-many relationship is created and removed from it when the relationship is destroyed.
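A minimal sketch of this "do both" idea, written as plain in-memory Ruby rather than ActiveRecord so it runs on its own. The class name `TagStore` and its methods are hypothetical: `taggings` stands in for the join table and `tag_cache` for the hstore/JSON field on the users table. In a real Rails app the same dual write would more likely live in association callbacks or a service object.

```ruby
require "set"

# In-memory stand-in for the two storage layouts (an assumption, not the answerer's code).
class TagStore
  attr_reader :taggings, :tag_cache

  def initialize
    @taggings  = Set.new                                  # [user_id, tag] pairs: the normalized source of truth
    @tag_cache = Hash.new { |hash, key| hash[key] = [] }  # user_id => denormalized list of tags
  end

  # Creating the relationship also adds the tag to the cached field.
  def add_tag(user_id, tag)
    return unless @taggings.add?([user_id, tag]) # nil when the pair already exists
    @tag_cache[user_id] << tag
  end

  # Destroying the relationship also removes the tag from the cached field.
  def remove_tag(user_id, tag)
    return unless @taggings.delete?([user_id, tag]) # nil when the pair was absent
    @tag_cache[user_id].delete(tag)
  end

  # Aggregate question ("how many people use a certain tag") answered from the join rows.
  def users_count_for(tag)
    @taggings.count { |(_, t)| t == tag }
  end

  # Per-user read served from the denormalized field, without touching the join rows.
  def tags_for(user_id)
    @tag_cache.fetch(user_id, [])
  end
end

store = TagStore.new
store.add_tag(1, "ruby")
store.add_tag(1, "postgres")
store.add_tag(2, "ruby")
store.remove_tag(1, "postgres")
```

The cost of this layout is that every write has to touch both places, so the cached field can drift if one of the two updates is skipped.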

To give you an example: in one of my projects I have companies (almost 10 million) that are interested in certain keywords and their ranking on Google. This table has millions of rows but is connected to only 2 million keywords, which are in turn connected to search results. This way I can quickly query which result is searched for by how many people, and who they are.

If a customer opens their keyword search page, I load their keywords from a text column containing JSON, which is faster than going through the table.
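The read path for such a JSON text column could look roughly like the sketch below. `Customer` and `keywords_json` are hypothetical names, and a `Struct` stands in for the database row; the point is only that loading the keywords is a single parse of one column, with no join.

```ruby
require "json"

# Hypothetical row with a plain text column that holds a JSON array of keywords.
Customer = Struct.new(:id, :keywords_json) do
  # Read path: parse the one column; an empty column means no keywords.
  def keywords
    JSON.parse(keywords_json || "[]")
  end

  # Write path: re-serialize the list whenever the normalized rows change.
  def keywords=(list)
    self.keywords_json = JSON.generate(list.uniq.sort)
  end
end

customer = Customer.new(42, nil)
customer.keywords = ["postgres hstore", "rails habtm", "postgres hstore"]
puts customer.keywords.inspect
```

Whether this actually beats a join depends on row counts and indexing, so it is worth measuring before committing to the extra column.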
