
How do I compare two hashes containing ~25000 hashes?

I have two hashes containing multiple hashes (product information).

What I want to do is compare the two hashes and see which products were added, deleted or updated (e.g. price, description, image).

old_hash.size
# => 24595

new_hash.size
# => 26153

Here's what the structure of the two hashes looks like:

{"wi230075"=>
  {"itemId"=>"wi230075",
   "description"=>"AH Verse frietaardappelen",
   "salesUnitSize"=>"2,5 kg",
   "images"=>[...],
   "fromPrice"=>2.19,
   "basePrice"=>{"price"=>2.19, "unitPriceDescription"=>"0.96/KG"},
   "score"=>0,
   "frozen"=>false,
   "isPBO"=>false,
   "outOfStock"=>false,
   "quantity"=>0,
   "extendedAttributes"=>[],
   "sourceId"=>{"source"=>"wi", "id"=>230075, "asString"=>"wi230075"},
   "hqIdSource"=>"AH_HQ",
   "hqId"=>822729,
   "productId"=>230075,
   "links"=>[],
   "category"=>"/Aardappel, groente, fruit/Aardappelen/Hele aardappel/",
   "brand"=>"AH"},
  {...}
}

I've tried comparing the two hashes using the HashDiff gem. Here's what I get:

diff = HashDiff.diff(old_hash, new_hash)
diff.size
# => 64378

Something seems to be going wrong; there can't be 64378 changes.

What is a better way to compare the two hashes?

Edit:

I'd just like to know if a product got added, deleted or edited. If it did, a simple true would suffice.

This will return all the keys that were changed (i.e. created, removed or updated):

(old_hash.keys | new_hash.keys).select { |k| old_hash[k] != new_hash[k] }
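
A small worked example may make the behaviour clearer. The product ids, descriptions and prices below are invented for illustration and are not taken from the original hashes; `changed.any?` is one way to get the "simple true" asked for in the question:

```ruby
# Hypothetical sample data; the ids and prices are invented for illustration.
old_hash = {
  "wi1" => { "description" => "Potatoes", "fromPrice" => 2.19 },
  "wi2" => { "description" => "Carrots",  "fromPrice" => 0.99 },
  "wi3" => { "description" => "Onions",   "fromPrice" => 1.49 }
}
new_hash = {
  "wi1" => { "description" => "Potatoes", "fromPrice" => 2.19 }, # unchanged
  "wi2" => { "description" => "Carrots",  "fromPrice" => 1.09 }, # price updated
  "wi4" => { "description" => "Leeks",    "fromPrice" => 1.29 }  # added
}

# Hash#== compares nested hashes and arrays by value, so a change in any
# field makes the whole product compare as different. A key missing from
# one side reads as nil there, which also differs from a product hash.
changed = (old_hash.keys | new_hash.keys).select { |k| old_hash[k] != new_hash[k] }

p changed       # prints ["wi2", "wi3", "wi4"]
p changed.any?  # prints true
```

Here "wi2" was updated, "wi3" was removed and "wi4" was added, so all three are reported while the untouched "wi1" is not.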

To get specific you can do something like:

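keys = (old_hash.keys | new_hash.keys)
new_keys = keys.select { |k| old_hash[k].nil? }
deleted_keys = keys.select { |k| new_hash[k].nil? }
# Only keys present in both hashes can be "modified"; without this filter the
# list would also contain every new and deleted key.
modified_keys = (keys - new_keys - deleted_keys).select { |k| old_hash[k] != new_hash[k] }
unchanged_keys = keys - (new_keys | deleted_keys | modified_keys)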

This assumes you're not interested in keys with nil values. If you are, replace the .nil? checks with a test for the key's presence instead.
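
One possible replacement for the nil checks, sketched here rather than taken from the original answer, is `Hash#key?`, which tests whether a key exists regardless of its value. The sample data is hypothetical and includes a key ("wi2") whose value is nil in both hashes:

```ruby
# Hypothetical sample data; "wi2" exists in both hashes with a nil value.
old_hash = { "wi1" => { "fromPrice" => 2.19 }, "wi2" => nil, "wi3" => { "fromPrice" => 1.49 } }
new_hash = { "wi1" => { "fromPrice" => 2.29 }, "wi2" => nil, "wi4" => { "fromPrice" => 1.29 } }

keys = old_hash.keys | new_hash.keys

# Hash#key? checks for the presence of the key, so nil values are not
# mistaken for missing entries.
new_keys     = keys.reject { |k| old_hash.key?(k) }
deleted_keys = keys.reject { |k| new_hash.key?(k) }

# Restrict the comparison to keys that exist on both sides.
modified_keys  = (keys - new_keys - deleted_keys).select { |k| old_hash[k] != new_hash[k] }
unchanged_keys = keys - new_keys - deleted_keys - modified_keys

p new_keys        # prints ["wi4"]
p deleted_keys    # prints ["wi3"]
p modified_keys   # prints ["wi1"]
p unchanged_keys  # prints ["wi2"]
```

With the `.nil?` version, "wi2" would have been counted as both new and deleted; with `key?` it is correctly reported as unchanged.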

I haven't tested the code, but I think it would look like this:

To get the added records:

added_keys = new_hash.keys - old_hash.keys
added_records = new_hash.select { |k, _v| added_keys.include?(k) }

To get the removed records:

removed_keys = old_hash.keys - new_hash.keys
removed_records = old_hash.select { |k, _v| removed_keys.include?(k) }

To get the changed records:

changed_records = new_hash.select do |k, v|
  # Only products present in both hashes can count as changed.
  old_hash.key?(k) &&
    %w[description images basePrice].any? { |field| old_hash[k][field] != v[field] }
end
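
The three steps above could be combined into one helper. The following is only a sketch: the method name `diff_products`, the `fields:` keyword and the sample data are all hypothetical. It uses `key?` lookups instead of `Array#include?`, which avoids scanning an array of keys for every product and may matter with roughly 25000 entries:

```ruby
# Hypothetical helper; the name and the default field list are assumptions.
def diff_products(old_hash, new_hash, fields: %w[description images basePrice])
  # Products whose id only exists in the new hash.
  added = new_hash.reject { |id, _product| old_hash.key?(id) }
  # Products whose id only exists in the old hash.
  removed = old_hash.reject { |id, _product| new_hash.key?(id) }
  # Products present in both hashes where at least one watched field differs.
  changed = new_hash.select do |id, product|
    old_hash.key?(id) && fields.any? { |f| old_hash[id][f] != product[f] }
  end
  { added: added, removed: removed, changed: changed }
end

# Invented sample data for illustration.
old_hash = {
  "wi1" => { "description" => "Potatoes", "basePrice" => { "price" => 2.19 }, "score" => 0 },
  "wi2" => { "description" => "Carrots",  "basePrice" => { "price" => 0.99 }, "score" => 0 },
  "wi3" => { "description" => "Onions",   "basePrice" => { "price" => 1.49 }, "score" => 0 }
}
new_hash = {
  "wi1" => { "description" => "Potatoes", "basePrice" => { "price" => 2.19 }, "score" => 5 },
  "wi2" => { "description" => "Carrots",  "basePrice" => { "price" => 1.09 }, "score" => 0 },
  "wi4" => { "description" => "Leeks",    "basePrice" => { "price" => 1.29 }, "score" => 0 }
}

result = diff_products(old_hash, new_hash)
p result[:added].keys    # prints ["wi4"]
p result[:removed].keys  # prints ["wi3"]
p result[:changed].keys  # prints ["wi2"]
```

Note that "wi1" is not reported as changed even though its "score" differs, because only the listed fields are compared. To treat any difference as a change, the field check could be replaced with `old_hash[id] != product`.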

