
How to use eager loading in a method called from a rake task

I have a task like this:

namespace :company do
  task :update, [:code] => :environment do |t, args|
    company = Company.find_or_create_by(code: args[:code])
    company.update_from_local_data
  end
end

And this is the Company class:

class Company < ActiveRecord::Base
  has_many :items
  def update_from_local_data
    data = YAML.load(File.read(ENV['COMPANY_DATA_FILE']))
    update_items(data)
  end
  def update_items(item_array)
    item_array.each do |value|
      item = items.find_or_initialize_by(name: value[:name])
      item.update_attributes(value)
    end
  end
end

I confirmed that this code issues a lot of SELECT queries.

In a controller I can handle this, but how can I use eager loading from a rake task?

Edit

Thanks to Uri's comments, I see how to improve performance when saving several records to the database, but I still have a problem with how to call find_or_initialize_by for several items.

I found the :on_duplicate_key_update option for ActiveRecord.import, but it can be used only with MySQL, while I'm using PostgreSQL.

Edit 2

To explain what the problem is, I created an example project.

This is the result of Company#update_from_local_data. I don't want a SELECT query for every Item.

How can I write it more efficiently?

c = Company.first
c.update_from_local_data
  Item Load (0.2ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item0' LIMIT 1  [["company_id", 1]]
   (0.1ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item1' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item2' LIMIT 1  [["company_id", 1]]
   (0.1ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item3' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item4' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item5' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item6' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item7' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item8' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
  Item Load (0.1ms)  SELECT  "items".* FROM "items"  WHERE "items"."company_id" = ? AND "items"."name" = 'item9' LIMIT 1  [["company_id", 1]]
   (0.0ms)  begin transaction
   (0.0ms)  commit transaction
=> [{:name=>"item0"}, {:name=>"item1"}, {:name=>"item2"}, {:name=>"item3"}, {:name=>"item4"}, {:name=>"item5"}, {:name=>"item6"}, {:name=>"item7"}, {:name=>"item8"}, {:name=>"item9"}]

You said you want to speed up the find_or_initialize_by step.

def update_items(item_array)
  # 'item_array' is an array of attribute hashes

  ActiveRecord::Base.transaction do
    # Fetch every existing item of this company in a single query
    names_array = item_array.map { |attributes| attributes[:name] }
    existing_records = items.where(name: names_array)

    # Index the fetched records by name so each lookup needs no query
    records_by_name = existing_records.each_with_object({}) do |record, hash|
      hash[record.name] = record
    end

    item_array.each do |attributes|
      name = attributes[:name]
      record = records_by_name[name] || items.build

      # with validations and callbacks:
      #   record.update_attributes(attributes)

      # without validations:
      #   attributes.each { |k, v| record[k] = v }
      #   record.save(validate: false)

      # without validations or callbacks:
      #   If you're using an older version of Rails,
      #   you can use record.send(:update_without_callbacks).
      #   For recent versions, you'll need to either write SQL
      #   or disable the callbacks with skip_callback and then
      #   re-enable them with set_callback.
    end
  end
end

Basically, you find all the existing records in one go rather than executing an individual search query for each name.
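To make the difference concrete without a database, here is a minimal plain-Ruby sketch. FakeItemStore, find_by_name and where_names are hypothetical stand-ins invented for this illustration (they are not ActiveRecord APIs); they only count how many lookups each strategy would issue.

```ruby
# Hypothetical in-memory stand-in for the items table.
# It is NOT ActiveRecord; it only counts simulated queries.
class FakeItemStore
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @query_count = 0
  end

  # Simulates one "SELECT ... WHERE name = ? LIMIT 1"
  def find_by_name(name)
    @query_count += 1
    @rows.find { |row| row[:name] == name }
  end

  # Simulates one "SELECT ... WHERE name IN (...)"
  def where_names(names)
    @query_count += 1
    @rows.select { |row| names.include?(row[:name]) }
  end
end

stored   = [{ name: "item0", price: 1 }, { name: "item1", price: 2 }]
incoming = [{ name: "item0", price: 10 },
            { name: "item1", price: 20 },
            { name: "item2", price: 30 }]

# Strategy 1: one lookup per incoming item
per_item_store = FakeItemStore.new(stored)
incoming.each { |attrs| per_item_store.find_by_name(attrs[:name]) }

# Strategy 2: one lookup for the whole batch, then a hash index by name
batched_store = FakeItemStore.new(stored)
existing = batched_store.where_names(incoming.map { |attrs| attrs[:name] })
records_by_name = existing.each_with_object({}) { |row, hash| hash[row[:name]] = row }

# Existing rows are updated, missing ones start from an empty hash
merged = incoming.map { |attrs| (records_by_name[attrs[:name]] || {}).merge(attrs) }

puts "per-item lookups: #{per_item_store.query_count}"
puts "batched lookups:  #{batched_store.query_count}"
```

In the real model the batch lookup would be the single where(name: ...) call, and the hash index plays the role of find_or_initialize_by for each name.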

  1. You can use update_all, which returns the number of rows updated. If 0 rows were updated, you create the new record:

     def update_items(item_array)
       item_array.each do |value|
         entries_updated = items.where(name: value[:name]).update_all(value)
         items.create!(value) if entries_updated == 0
       end
     end

    Notice that create! will raise an error if it can't create the record. You may wish to use plain create and handle validation errors on your own. Also keep in mind that update_all writes directly to the database and skips validations and callbacks.

  2. Another way to go about it, based on the interface you suggested in chat (items = load_all(item_array); items.update_all), is:

     def update_items(item_array)
       # index_by maps each name to a single attributes hash
       indexed = item_array.index_by { |i| i[:name] }
       items.where(name: indexed.keys).each do |item|
         item.assign_attributes(indexed[item.name])
         item.save! if item.changed?
       end
     end

    That gives you fewer queries if not all items change frequently. It only touches items that already exist, so names without a matching record still have to be created separately. It can be slow if Company has thousands of items, but you could break item_array into smaller groups and process each group in turn. Notice that update_all applies the same values to every matched row, so it cannot produce one UPDATE statement that changes multiple records with different values.
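The idea of breaking item_array into smaller groups can be sketched in plain Ruby with each_slice. The helper name each_name_batch and the batch size of 4 are assumptions made for this illustration; inside the real model, each yielded batch would feed one items.where(name: by_name.keys) query.

```ruby
# Hypothetical helper: walks the incoming attribute hashes in slices
# and yields each slice indexed by name.
def each_name_batch(item_array, batch_size)
  item_array.each_slice(batch_size) do |slice|
    by_name = slice.each_with_object({}) { |attrs, hash| hash[attrs[:name]] = attrs }
    yield by_name
  end
end

# Same shape as the data in the question: ten items named item0..item9
item_array = (0...10).map { |i| { name: "item#{i}" } }

batches = []
each_name_batch(item_array, 4) do |by_name|
  # In the model this is where one query per batch would run,
  # e.g. items.where(name: by_name.keys) (assumed, not executed here).
  batches << by_name.keys
end

puts "number of batches: #{batches.length}"
```

With ten items and a batch size of 4 this yields three batches, so the database would see three lookups instead of ten.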


 