
Merge hashes containing same key & value pair

arr1 = [
  {entity_type: "Mac", entity_ids: [3], cascade_id: 2, location_id: 1},
  {entity_type: "Mac", entity_ids: [2], cascade_id: 2, location_id: 1},
  {entity_type: "Mac", entity_ids: [9], cascade_id: 4, location_id: 1},
  {entity_type: "Mac", entity_ids: [10], cascade_id: 4, location_id: 1}
]

This is part of the data I get after some iterations in my own logic. My desired output for this example is

[{entity_type: "Mac", entity_ids: [3,2], cascade_id: 2, location_id: 1}, {entity_type: "Mac", entity_ids: [9,10], cascade_id: 4, location_id: 1}]

I want to know how to merge hashes when one or more of their key-value pairs are the same, combining the values of the remaining key into a single array.

Here is one more instance:

arr2 = [
  {entity_type: "Sub", entity_ids: [7], mac_id: 5, cascade_id: 1, location_id: 1},
  {entity_type: "Sub", entity_ids: [10], mac_id: 5, cascade_id: 1, location_id: 1},
  {entity_type: "Sub", entity_ids: [4], mac_id: 2, cascade_id: 1, location_id: 1},
  {entity_type: "Sub", entity_ids: [11], mac_id: 7, cascade_id: 2, location_id: 2}
]

The desired output for this instance is

[{entity_type: "Sub", entity_ids: [7, 10], mac_id: 5, cascade_id: 1, location_id: 1}, {entity_type: "Sub", entity_ids: [4], mac_id: 2, cascade_id: 1, location_id: 1}, {entity_type: "Sub", entity_ids: [11], mac_id: 7, cascade_id: 2, location_id: 2}]

This will work:

  def combine(collection)
    return [] if collection.empty?
    grouping_key = collection.first.keys - [:entity_ids]

    grouped_collection = collection.group_by do |element|
      grouping_key.map { |key| [key, element[key]] }.to_h
    end

    grouped_collection.map do |key, elements|
      key.merge(entity_ids: elements.map { |e| e[:entity_ids] }.flatten.uniq)
    end
  end

Here's what's going on:

First we determine a "grouping key" for the collection by sampling the keys of the first element and removing :entity_ids. All other keys combined make up the grouping key on which the combination depends.

The Enumerable#group_by method iterates over a collection and groups it by the grouping key we just constructed.

We then iterate over the grouped collection and merge in a new entity_ids attribute made up of the combined entity ids from each group.
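
As a rough sketch of those three steps run one at a time on arr1 from the question (the variable names grouped and combined are only illustrative, and flat_map is used here in place of map followed by flatten):

```ruby
# Sample input copied from the question (arr1).
arr1 = [
  { entity_type: "Mac", entity_ids: [3],  cascade_id: 2, location_id: 1 },
  { entity_type: "Mac", entity_ids: [2],  cascade_id: 2, location_id: 1 },
  { entity_type: "Mac", entity_ids: [9],  cascade_id: 4, location_id: 1 },
  { entity_type: "Mac", entity_ids: [10], cascade_id: 4, location_id: 1 }
]

# Step 1: every key except :entity_ids takes part in the grouping key.
grouping_key = arr1.first.keys - [:entity_ids]
# grouping_key is [:entity_type, :cascade_id, :location_id]

# Step 2: group the elements by the sub-hash built from those keys.
grouped = arr1.group_by do |element|
  grouping_key.map { |key| [key, element[key]] }.to_h
end
# grouped has two keys here, one per distinct cascade_id.

# Step 3: fold each group back into one hash with the combined ids.
combined = grouped.map do |key, elements|
  key.merge(entity_ids: elements.flat_map { |e| e[:entity_ids] }.uniq)
end
```

Note that :entity_ids ends up as the last key of each result hash, since it is merged into the grouping key. Hash equality ignores key order, so the result still compares equal to the desired output. Also note that uniq drops repeated ids, which may or may not be wanted.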

You can compute the desired result as follows.

def doit(arr)
  arr.each_with_object({}) do |g,h|
    h.update(g.reject { |k,_| k==:entity_ids }=>g) do |_,o,n|
      o.merge(entity_ids: o[:entity_ids] + n[:entity_ids])
    end
  end.values
end

doit(arr1)
  #=> [{:entity_type=>"Mac", :entity_ids=>[3, 2], :cascade_id=>2, :location_id=>1},
  #    {:entity_type=>"Mac", :entity_ids=>[9, 10], :cascade_id=>4, :location_id=>1}]
doit(arr2)
  #=> [{:entity_type=>"Sub", :entity_ids=>[7, 10], :mac_id=>5, :cascade_id=>1,
  #     :location_id=>1},
  #    {:entity_type=>"Sub", :entity_ids=>[4], :mac_id=>2, :cascade_id=>1,
  #     :location_id=>1},
  #    {:entity_type=>"Sub", :entity_ids=>[11], :mac_id=>7, :cascade_id=>2,
  #     :location_id=>2}]

This uses the form of Hash#update (aka merge!) that employs a block to determine the values of keys that are present in both hashes being merged. See the doc for an explanation of the three block variables: the common key, the old value and the new value (written _, o and n above).
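
To see the block form in isolation, here is a minimal sketch on two made-up hashes (stock and delivery are hypothetical names, not part of the question):

```ruby
# Hypothetical hashes, chosen only to show when the block runs.
stock    = { apples: 3, pears: 1 }
delivery = { apples: 2, plums: 5 }

# The block is called only for keys present in both hashes (:apples here).
# It receives the key, the receiver's value and the argument's value,
# and its return value becomes the new value for that key.
stock.update(delivery) { |_key, old, new| old + new }

# stock has been modified in place and now equals
# { apples: 5, pears: 1, plums: 5 }
```

Keys found in only one of the two hashes (:pears and :plums) are copied over without the block being consulted.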

If doit's argument is arr1, the steps are as follows.

arr = arr1
e =  arr.each_with_object({})
  #=> #<Enumerator: [{:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2,
  #                   :location_id=>1},
  #                  {:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2,
  #                   :location_id=>1},
  #                  {:entity_type=>"Mac", :entity_ids=>[9], :cascade_id=>4,
  #                   :location_id=>1},
  #                  {:entity_type=>"Mac", :entity_ids=>[10], :cascade_id=>4,
  #                   :location_id=>1}
  #                 ]:each_with_object({})>

The first element of the enumerator is passed to the block and values are assigned to the block variables.

g, h = e.next
  #=> [{:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}, {}]
g #=> {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}
h #=> {}

Compute the (only) key for the hash to be merged with h.

a = g.reject { |k,_| k==:entity_ids }
  #=> {:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}

Perform the update operation.

h.update(a=>g)
  #=> {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
  #    {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}}

This is the new value of h. As h (which was empty) did not have the key

{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}

the block was not used to determine the value of this key in the merged hash.

Now generate the next value of the enumerator e, pass it to the block, assign values to the block variables and perform the block calculation.

g, h = e.next
  #=> [{:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2, :location_id=>1},
  #    {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
  #     {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}}]
g #=> {:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2, :location_id=>1}
h #=> {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
  #    {:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1}}
a = g.reject { |k,_| k==:entity_ids }
  #=> {:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}
h.update(a=>g) do |_,o,n|
  puts "_=#{_}, o=#{o}, n=#{n}"
  o.merge(entity_ids: o[:entity_ids] + n[:entity_ids])
end
  #=> {{:entity_type=>"Mac", :cascade_id=>2, :location_id=>1}=>
  #    {:entity_type=>"Mac", :entity_ids=>[3, 2], :cascade_id=>2, :location_id=>1}}

This is the new value of h. As both h and the hash being merged, {a=>g}, have the key a, the block is consulted to obtain the value of that key in the merged hash (the new h). The values of the block variables are printed.

_={:entity_type=>"Mac", :cascade_id=>2, :location_id=>1},
o={:entity_type=>"Mac", :entity_ids=>[3], :cascade_id=>2, :location_id=>1},
n={:entity_type=>"Mac", :entity_ids=>[2], :cascade_id=>2, :location_id=>1}

h[a][:entity_ids] is therefore replaced with

o[:entity_ids] + n[:entity_ids]
  #=> [3] + [2] => [3, 2]

The calculations for the two remaining elements of e are similar, after which

h #=> {{ :entity_type=>"Mac", :cascade_id=>2, :location_id=>1 }=>
  #      { :entity_type=>"Mac", :entity_ids=>[3, 2], :cascade_id=>2, :location_id=>1 },
  #    { :entity_type=>"Mac", :cascade_id=>4, :location_id=>1 }=>
  #      { :entity_type=>"Mac", :entity_ids=>[9, 10], :cascade_id=>4, :location_id=>1 }}

The final step is to return the values of this hash.

h.values
  #=> <as shown above>
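
The walk-through can be replayed in one go. The sketch below uses a plain each loop instead of each_with_object and adds a counter (block_calls, an illustrative name) to confirm when the update block is consulted:

```ruby
# Sample input copied from the question (arr1).
arr1 = [
  { entity_type: "Mac", entity_ids: [3],  cascade_id: 2, location_id: 1 },
  { entity_type: "Mac", entity_ids: [2],  cascade_id: 2, location_id: 1 },
  { entity_type: "Mac", entity_ids: [9],  cascade_id: 4, location_id: 1 },
  { entity_type: "Mac", entity_ids: [10], cascade_id: 4, location_id: 1 }
]

h = {}
block_calls = 0 # counts how often the update block is consulted

arr1.each do |g|
  # The key is g without :entity_ids.
  a = g.reject { |k, _| k == :entity_ids }
  h.update(a => g) do |_, o, n|
    block_calls += 1
    o.merge(entity_ids: o[:entity_ids] + n[:entity_ids])
  end
end

result = h.values
# block_calls is 2: the block runs for the second and fourth elements,
# whose keys are already present in h.
```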

Note that some of the block variables are underscores (_). Though they are valid local variables, they are commonly used to indicate that they are not used in the block calculation. An alternative convention is to have the unused block variable begin with an underscore, such as _key.
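
A small illustration of the two conventions, using a made-up hash (counts is hypothetical):

```ruby
counts = { a: 1, b: 2 }

# A bare underscore for the unused key.
total = 0
counts.each { |_, value| total += value }

# An underscore-prefixed name, which also documents what is being ignored.
doubled = counts.map { |_key, value| value * 2 }
```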

There are two separate challenges in your problem.

  1. Merging the hashes.
  2. Merging only when the other values match.

Problem 1:

To get custom behaviour while merging, you can pass a block to the merge method. The block takes the key and the left and right values, and is called only for keys present in both hashes. In your scenario you want to combine the arrays when key == :entity_ids and otherwise keep the left value.

one_entity.merge(other_entity){ |key, left, right|
  key== :entity_ids ? left + right : left
}
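
For instance, with the first two elements of arr1 standing in for one_entity and other_entity (a sketch, with the result stored in an illustrative variable merged):

```ruby
one_entity   = { entity_type: "Mac", entity_ids: [3], cascade_id: 2, location_id: 1 }
other_entity = { entity_type: "Mac", entity_ids: [2], cascade_id: 2, location_id: 1 }

# The block decides the value for every key the two hashes share.
merged = one_entity.merge(other_entity) do |key, left, right|
  key == :entity_ids ? left + right : left
end

# merged[:entity_ids] is [3, 2]; merge returns a new hash,
# so one_entity and other_entity are left unchanged.
```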

Problem 2:

To decide which entities can be merged, that is, which ones share all their other attributes, I am using group_by. This gives a hash whose values are arrays of entities that can be merged, which I can then map over and merge. Here actual is the input array (arr1 or arr2).

actual.group_by { |x| [x[:entity_type], x[:mac_id], x[:cascade_id], x[:location_id]] }

Combining the two gives the whole solution. Note that cascade_id has to be part of the grouping key, otherwise all four elements of arr1 fall into one group. You can add more attributes to the group_by block if you want.

actual.group_by { |x| [x[:entity_type], x[:mac_id], x[:cascade_id], x[:location_id]] }
      .map { |_, entities|
        entities.reduce({}) { |result, entity|
          result.merge(entity) { |key, left, right|
            key == :entity_ids ? left + right : left
          }
        }
      }
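
As a sketch of how this could be packaged for reuse, the method below takes the grouping attributes as a parameter. The name merge_entities and its group_keys argument are hypothetical, and values_at is used to build the grouping key (it returns nil for a missing key such as :mac_id in arr1):

```ruby
# Hypothetical helper: group by the given keys, then merge each group,
# concatenating :entity_ids and keeping the left value for every other key.
def merge_entities(entities, group_keys)
  entities
    .group_by { |entity| entity.values_at(*group_keys) }
    .map do |_, group|
      group.reduce do |result, entity|
        result.merge(entity) do |key, left, right|
          key == :entity_ids ? left + right : left
        end
      end
    end
end

# Sample input copied from the question (arr2).
arr2 = [
  { entity_type: "Sub", entity_ids: [7],  mac_id: 5, cascade_id: 1, location_id: 1 },
  { entity_type: "Sub", entity_ids: [10], mac_id: 5, cascade_id: 1, location_id: 1 },
  { entity_type: "Sub", entity_ids: [4],  mac_id: 2, cascade_id: 1, location_id: 1 },
  { entity_type: "Sub", entity_ids: [11], mac_id: 7, cascade_id: 2, location_id: 2 }
]

keys = [:entity_type, :mac_id, :cascade_id, :location_id]
result = merge_entities(arr2, keys)
```

Because merge returns a new hash at each step, the elements of the input array are not modified.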

