简体   繁体   中英

ActiveRecord query array intersection?

I'm trying to figure out the count of certain types of articles. I have a very inefficient query:

Article.where(status: 'Finished').select{|x| x.tags & Article::EXPERT_TAGS}.size

In my quest to be a better programmer, I'm wondering how to make this a faster query. tags is an array of strings in Article , and Article::EXPERT_TAGS is another array of strings. I want to find the intersection of the arrays, and get the resulting record count.

EDIT: Article::EXPERT_TAGS and article.tags are defined as Mongo arrays. These arrays hold strings, and I believe they are serialized strings. For example: Article.first.tags = ["Guest Writer", "News Article", "Press Release"] . Unfortunately this is not set up properly as a separate table of Tags.

2nd EDIT: I'm using MongoDB, so actually it is using a MongoWrapper like MongoMapper or mongoid, not ActiveRecord. This is an error on my part, sorry! Because of this error, it screws up the analysis of this question. Thanks PinnyM for pointing out the error!

Since you are using MongoDB, you could also consider a MongoDB-specific solution (aggregation framework) for the array intersection, so that you could get the database to do all the work before fetching the final result.

See this SO thread How to check if an array field is a part of another array in MongoDB?

Assuming that the entire tags list is stored in a single database field and that you want to keep it that way, I don't see much scope of improvement, since you need to get all the data into Ruby for processing.

However, there is one problem with your database query

Article.where(status: 'Finished')

# This translates into the following query
SELECT * FROM articles WHERE status = 'Finished'

Essentially, you are fetching all the columns whereas you only need the tags column for your process. So, you can use pluck like this:

Article.where(status: 'Finished').pluck(:tags)

# This translates into the following query
SELECT tags FROM articles WHERE status = 'Finished'

I answered a question regarding general intersection like queries in ActiveRecord here .

Extracted below:


The following is a general approach I use for constructing intersection like queries in ActiveRecord:

class Service < ActiveRecord::Base
  belongs_to :person

  def self.with_types(*types)
    where(service_type: types)
  end
end

class City < ActiveRecord::Base
  has_and_belongs_to_many :services
  has_many :people, inverse_of: :city
end

class Person < ActiveRecord::Base
  belongs_to :city, inverse_of: :people

  def self.with_cities(cities)
    where(city_id: cities)
  end

  # intersection like query
  def self.with_all_service_types(*types)
    types.map { |t|
      joins(:services).merge(Service.with_types t).select(:id)
    }.reduce(scoped) { |scope, subquery|
      scope.where(id: subquery)
    }
  end
end

Person.with_all_service_types(1, 2)
Person.with_all_service_types(1, 2).with_cities(City.where(name: 'Gold Coast'))

It will generate SQL of the form:

SELECT "people".*
  FROM "people"
 WHERE "people"."id" in (SELECT "people"."id" FROM ...)
   AND "people"."id" in (SELECT ...)
   AND ...

You can create as many subqueries as required with the above approach based on any conditions/joins etc so long as each subquery returns the id of a matching person in its result set.

Each subquery result set will be AND'ed together thus restricting the matching set to the intersection of all of the subqueries.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM