
Postgres HStore vs HABTM

I am building an app that has a model that can be tagged with entries from another model, similar to the tagging feature on Stack Overflow.

For example:

class Question < ActiveRecord::Base
    has_and_belongs_to_many :tags
end

class Tag < ActiveRecord::Base
    has_and_belongs_to_many :questions
end

I am debating between setting up a has_and_belongs_to_many relationship with a join table and storing the tags in a hash using Postgres' hstore feature.

I am looking for anyone with similar experience who can speak to performance differences, obstacles, or anything else that should persuade me one way or the other.

EDIT:

I should also mention that this will be an API consumed by an AngularJS frontend.

You are describing the topic of a great debate :) Normalization vs. denormalization. A many-to-many relationship lets you run queries such as "how many people use a certain tag" very simply. hstore is very nice as well, but you end up with thousands of copies of the same tag stored across rows. I use both approaches in different projects, but the real problem comes the day you decide to move to a different database: with hstore you are either stuck with PostgreSQL or have to rewrite your code.

If very high speed matters, you also need to query the data in different ways, and you often want to load a user record in one fell swoop and show all of its tags, I normally do both. First I create a many-to-many relationship, because tags are usually connected to more objects as well: a user has many tags from the tags table, and tags are connected to, say, brands, which are connected to products, and so on.

Then I add an extra hstore or JSON column on the users table and update it whenever a tag is added through the many-to-many relationship, or removed when that relationship is destroyed.
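To make the "do both" idea concrete, here is a minimal sketch in plain Ruby. It is an in-memory stand-in rather than ActiveRecord code: the `TaggingStore` class and its method names are hypothetical, the `Set` plays the role of the join table, and a JSON string per user plays the role of the cached hstore/JSON column. In a Rails app the same refresh step would probably live in callbacks on the join model, but that part is an assumption, not something stated above.

```ruby
require "json"
require "set"

# Hypothetical in-memory model of the two layouts; not ActiveRecord.
class TaggingStore
  def initialize
    @taggings = Set.new                              # join-table rows: [user_id, tag]
    @tag_cache = Hash.new { |h, k| h[k] = "[]" }     # per-user JSON text "column"
  end

  # Write path: change the join table first, then refresh the cached copy.
  def add_tag(user_id, tag)
    @taggings.add([user_id, tag])
    refresh_cache(user_id)
  end

  def remove_tag(user_id, tag)
    @taggings.delete([user_id, tag])
    refresh_cache(user_id)
  end

  # Normalized query: how many users carry this tag?
  def user_count_for(tag)
    @taggings.count { |(_uid, t)| t == tag }
  end

  # Denormalized read: one lookup, no scan of the join rows.
  def cached_tags(user_id)
    JSON.parse(@tag_cache[user_id])
  end

  private

  # Rebuild the user's cached tag list from the join rows.
  def refresh_cache(user_id)
    tags = @taggings.select { |(uid, _t)| uid == user_id }.map { |(_uid, t)| t }.sort
    @tag_cache[user_id] = JSON.generate(tags)
  end
end

store = TaggingStore.new
store.add_tag(1, "ruby")
store.add_tag(1, "postgres")
store.add_tag(2, "ruby")
store.remove_tag(1, "postgres")

puts store.user_count_for("ruby")   # prints 2
puts store.cached_tags(1).inspect   # prints ["ruby"]
```

The join rows stay the source of truth, so the cached column can always be rebuilt from them if the two ever drift apart.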

To give you an example: in one of my projects I have companies (almost 10 million) that are interested in certain keywords and their ranking on Google. This table has millions of rows, but they are connected to only 2 million keywords, which are in turn connected to search results. This way I can quickly query how many people search for a given result and who they are.

When a customer opens their keyword search page, I load their keywords from a text column containing JSON, which is faster than going through the table.
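The two read paths in that example can be sketched side by side. This is again plain Ruby with made-up data: the constants, the company name, and the helper names below are all hypothetical, and the arrays and hashes only stand in for database tables.

```ruby
require "json"

# Hypothetical rows; the real tables described above hold millions of these.
COMPANY_KEYWORDS = [            # join table: company_id -> keyword_id
  { company_id: 1, keyword_id: 10 },
  { company_id: 1, keyword_id: 11 },
  { company_id: 2, keyword_id: 10 }
].freeze

KEYWORDS = { 10 => "hstore", 11 => "habtm" }.freeze

COMPANIES = {                   # company row with a JSON text column
  1 => { name: "Acme", keywords_json: '["habtm","hstore"]' }
}.freeze

# Normalized read: filter the join rows, then resolve each keyword.
def keywords_via_join(company_id)
  COMPANY_KEYWORDS
    .select { |row| row[:company_id] == company_id }
    .map { |row| KEYWORDS.fetch(row[:keyword_id]) }
    .sort
end

# Denormalized read: parse the text column of the company row itself.
def keywords_via_cache(company_id)
  JSON.parse(COMPANIES.fetch(company_id)[:keywords_json])
end

# Reverse lookup that only the join table answers directly:
# which companies track this keyword?
def companies_tracking(keyword_id)
  COMPANY_KEYWORDS
    .select { |row| row[:keyword_id] == keyword_id }
    .map { |row| row[:company_id] }
end
```

Both `keywords_via_join(1)` and `keywords_via_cache(1)` return the same list here; the cached version simply skips the join, while `companies_tracking` shows the kind of question the join table is kept around for.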


 