How to use eager loading in a method called from a rake task
I have a task like this:
namespace :company do
  task :update, [:code] => :environment do |t, args|
    company = Company.find_or_create_by(code: args[:code])
    company.update_from_local_data
  end
end
And this is the Company class.
class Company < ActiveRecord::Base
  has_many :items

  def update_from_local_data
    data = YAML.load(File.read(ENV['COMPANY_DATA_FILE']))
    update_items(data)
  end

  def update_items(item_array)
    item_array.each do |value|
      item = items.find_or_initialize_by(name: value[:name])
      item.update_attributes(value)
    end
  end
end
I confirmed that this code issues a lot of SELECT SQL queries.
In a controller I can deal with this, but how can I use eager loading from a rake task?
Thanks to Uri's comments, I see how to improve performance when saving several records to the database, but I still don't know how to call find_or_initialize_by for several items at once.
I found the :on_duplicate_key_update option for ActiveRecord.import, but it can be used only with MySQL, while I'm using PostgreSQL.
To explain the problem, I created an example project.
This is the result of Company#update_from_local_data. I don't want a SELECT query for every Item.
How can I write it more efficiently?
c = Company.first
c.update_from_local_data
Item Load (0.2ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item0' LIMIT 1 [["company_id", 1]]
(0.1ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item1' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item2' LIMIT 1 [["company_id", 1]]
(0.1ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item3' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item4' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item5' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item6' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item7' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item8' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
Item Load (0.1ms) SELECT "items".* FROM "items" WHERE "items"."company_id" = ? AND "items"."name" = 'item9' LIMIT 1 [["company_id", 1]]
(0.0ms) begin transaction
(0.0ms) commit transaction
=> [{:name=>"item0"}, {:name=>"item1"}, {:name=>"item2"}, {:name=>"item3"}, {:name=>"item4"}, {:name=>"item5"}, {:name=>"item6"}, {:name=>"item7"}, {:name=>"item8"}, {:name=>"item9"}]
You said you want to speed up the find_or_initialize_by step.
def update_items(item_array)
  # item_array is an array of attribute hashes
  ActiveRecord::Base.transaction do
    names_array = item_array.map { |attributes| attributes[:name] }
    # One SELECT for all names instead of one SELECT per item
    existing_records = items.where(name: names_array)
    records_by_name = existing_records.each_with_object({}) do |record, hash|
      hash[record.name] = record
    end
    item_array.each do |attributes|
      name = attributes[:name]
      # Reuse the loaded record, or build a new one on this company
      record = records_by_name[name] || items.build
      # with validations and callbacks:
      # record.update_attributes(attributes)
      # without validations:
      # attributes.each { |k, v| record[k] = v }
      # record.save(validate: false)
      # without validations or callbacks:
      # If you're using an older version of Rails,
      # you can use record.send(:update_without_callbacks)
      # For recent versions, you'll need to either write SQL
      # or disable the callbacks with skip_callback and then re-enable
      # them with set_callback
    end
  end
end
Basically, you find all the existing records in one go rather than executing an individual search query for each name.
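The effect of the single lookup can be seen without a database. The sketch below is a plain-Ruby stand-in, not ActiveRecord: FakeItemTable, find_by_name, where_names and the query counter are hypothetical names invented for illustration, with where_names playing the role of one WHERE name IN (...) query.

```ruby
# Plain-Ruby stand-in for the items table; not ActiveRecord.
# FakeItemTable and its methods are hypothetical, for illustration only.
class FakeItemTable
  attr_reader :query_count

  def initialize(rows)
    @rows = rows # array of hashes such as { name: "item0" }
    @query_count = 0
  end

  # Mimics one SELECT ... WHERE name = ? LIMIT 1
  def find_by_name(name)
    @query_count += 1
    @rows.find { |row| row[:name] == name }
  end

  # Mimics one SELECT ... WHERE name IN (...)
  def where_names(names)
    @query_count += 1
    @rows.select { |row| names.include?(row[:name]) }
  end
end

incoming = (0..9).map { |i| { name: "item#{i}" } }
stored = (0..4).map { |i| { name: "item#{i}" } }

# One lookup per incoming item, as in the original update_items.
per_item = FakeItemTable.new(stored)
incoming.each { |attrs| per_item.find_by_name(attrs[:name]) }

# One lookup for all names, then an in-memory index keyed by name.
batched = FakeItemTable.new(stored)
names = incoming.map { |attrs| attrs[:name] }
records_by_name = batched.where_names(names).each_with_object({}) do |row, hash|
  hash[row[:name]] = row
end
# Names with no stored row are the ones that would need a new record.
missing = names.reject { |name| records_by_name.key?(name) }

puts "per-item lookups: #{per_item.query_count}"
puts "batched lookups:  #{batched.query_count}"
puts "names to create:  #{missing.inspect}"
```

The index keyed by name is what replaces the repeated find_or_initialize_by calls: a missing key means a new record has to be built.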
You can use update_all, which returns the number of entries updated. If 0 entries were updated, you create the new record:
def update_items(item_array)
  item_array.each do |value|
    entries_updated = items.where(name: value[:name]).update_all(value)
    if entries_updated == 0
      items.create!(value)
    end
  end
end
Notice that create! will raise an error if it can't create the record. You may wish to use plain create instead and handle validation errors on your own.
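Two things are worth keeping in mind with this pattern: update_all writes straight to the database without running validations or callbacks, and the loop still issues one UPDATE per item plus an INSERT whenever nothing matched. The control flow can be sketched in plain Ruby; FakeItems, update_all_by_name and the price attribute are hypothetical stand-ins for the association, not ActiveRecord calls.

```ruby
# Hypothetical in-memory stand-in for the items association; not ActiveRecord.
class FakeItems
  attr_reader :rows, :statements

  def initialize(rows)
    @rows = rows
    @statements = [] # records which kind of statement would have been sent
  end

  # Mimics where(name: ...).update_all(attrs): returns the number of rows changed.
  def update_all_by_name(name, attrs)
    @statements << :update
    matched = @rows.select { |row| row[:name] == name }
    matched.each { |row| row.merge!(attrs) }
    matched.size
  end

  # Mimics create!: appends a new row.
  def create!(attrs)
    @statements << :insert
    @rows << attrs.dup
  end
end

items = FakeItems.new([{ name: "item0", price: 1 }])
item_array = [{ name: "item0", price: 5 }, { name: "item1", price: 7 }]

item_array.each do |value|
  entries_updated = items.update_all_by_name(value[:name], value)
  # Fall back to an insert only when no existing row matched.
  items.create!(value) if entries_updated == 0
end

puts items.statements.inspect
puts items.rows.inspect
```

This removes the SELECT per item, but the number of statements still grows with the size of item_array.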
Another way to go about it, based on the interface you suggested in chat (items = load_all(item_array); items.update_all), is:
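def update_items(item_array)
  # group_by maps each name to an array of attribute hashes
  grouped = item_array.group_by { |i| i[:name] }
  # One SELECT loads every existing item whose name appears in the data
  items.where(name: grouped.keys).each do |item|
    # Take the first hash for this name; assign_attributes expects a hash
    data = grouped[item.name].first
    item.assign_attributes(data)
    item.save! if item.changed?
  end
end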
That gives you fewer queries if not all items change frequently. It can be slow if Company has thousands of items, but you could break item_array into smaller groups and run the method on each group. Notice that there's no way to produce one update statement that will change multiple records based on different criteria.
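Splitting item_array into smaller groups can be done with Enumerable#each_slice. A minimal sketch, assuming a batch size of 3 purely for illustration (batch_size is a hypothetical value to tune against the real data):

```ruby
# Split the attribute hashes into fixed-size batches so that each lookup
# only has to cover a bounded number of names.
batch_size = 3 # hypothetical; tune for the real data set

item_array = (0..9).map { |i| { name: "item#{i}" } }

# Each batch is reduced to the list of names it would look up.
batches = item_array.each_slice(batch_size).map do |batch|
  batch.map { |attrs| attrs[:name] }
end

batches.each_with_index do |names, index|
  # In the real method, each batch would drive one
  # items.where(name: names) query followed by the per-record saves.
  puts "batch #{index}: #{names.inspect}"
end
```

With ten items and a batch size of 3, this yields four batches, the last holding a single name.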