Dynamically create hash from array of arrays

I want to dynamically build a Hash from an array of arrays without overwriting keys. Each inner array alternates keys and values, and a key may be a colon-separated string describing the nested key that should be created. However, I keep overwriting keys, so only the last key under each parent survives.

data = {}

values = [
  ["income:concessions",  0, "noi", "722300",  "purpose", "refinancing"],
  ["fees:fee-one", "0" ,"income:gross-income", "900000", "expenses:admin", "7500"],
  ["fees:fee-two", "0", "address:zip", "10019", "expenses:other", "0"]
]

What it should look like:

{
  "income" => {
    "concessions" => 0,
    "gross-income" => "900000"
  },
  "expenses" => {
    "admin" => "7500",
    "other" => "0"
  },
  "noi" => "722300",
  "purpose" => "refinancing",
  "fees" => {
    "fee-one" => 0,
    "fee-two" => 0
  },
  "address" => {
    "zip" => "10019"
  }
}

This is the code I currently have. How can I avoid overwriting keys when I merge?

values.each do |row|
  Hash[*row].each do |key, value|
    keys = key.split(':')

    if !data.dig(*keys)
      hh = keys.reverse.inject(value) { |a, n| { n => a } }
      a = data.merge!(hh)
    end

  end
end

The code you've provided can be modified to merge hashes on conflict instead of overwriting:

values.each do |row|
  Hash[*row].each do |key, value|
    keys = key.split(':')

    if !data.dig(*keys)
      hh = keys.reverse.inject(value) { |a, n| { n => a } }
      data.merge!(hh) { |_, old, new| old.merge(new) }
    end
  end
end

But this code only works for two levels of nesting.
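
To see the two-level limit concretely, here is a minimal sketch. The keys "a", "b", "c" and "d" are made up for illustration and do not come from the question. The block merges the values one level down, but anything nested deeper is still replaced:

```ruby
data     = { "a" => { "b" => { "c" => 1 } } }
incoming = { "a" => { "b" => { "d" => 2 } } }

# On the "a" conflict the block calls old.merge(new), which is a shallow merge.
# Inside that merge "b" conflicts again, and the new value replaces the old one.
data.merge!(incoming) { |_key, old, new| old.merge(new) }

data # "c" has been dropped: {"a"=>{"b"=>{"d"=>2}}}
```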

By the way, I noticed the ruby-on-rails tag on the question. ActiveSupport provides a deep_merge! method that fixes the problem:

values.each do |row|
  Hash[*row].each do |key, value|
    keys = key.split(':')

    if !data.dig(*keys)
      hh = keys.reverse.inject(value) { |a, n| { n => a } }
      data.deep_merge!(hh)
    end
  end
end

values.flatten.each_slice(2).with_object({}) do |(f,v),h|
  k,e = f.is_a?(String) ? f.split(':') : [f,nil]
  h[k] = e.nil? ? v : (h[k] || {}).merge(e=>v)
end
  #=> {"income"=>{"concessions"=>0, "gross-income"=>"900000"},
  #    "noi"=>"722300",
  #    "purpose"=>"refinancing",
  #    "fees"=>{"fee-one"=>"0", "fee-two"=>"0"},
  #    "expenses"=>{"admin"=>"7500", "other"=>"0"},
  #    "address"=>{"zip"=>"10019"}}

The steps are as follows.

values = [
  ["income:concessions",  0, "noi", "722300",  "purpose", "refinancing"],
  ["fees:fee-one", "0" ,"income:gross-income", "900000", "expenses:admin", "7500"],
  ["fees:fee-two", "0", "address:zip", "10019", "expenses:other", "0"]
]

a = values.flatten
  #=> ["income:concessions", 0, "noi", "722300", "purpose", "refinancing",
  #    "fees:fee-one", "0", "income:gross-income", "900000", "expenses:admin", "7500",
  #    "fees:fee-two", "0", "address:zip", "10019", "expenses:other", "0"]
enum1 = a.each_slice(2)
  #=> #<Enumerator: ["income:concessions", 0, "noi", "722300",
  #    "purpose", "refinancing", "fees:fee-one", "0", "income:gross-income", "900000",
  #    "expenses:admin", "7500", "fees:fee-two", "0", "address:zip", "10019",
  # "expenses:other","0"]:each_slice(2)>

We can see what values this enumerator will generate by converting it to an array.

enum1.to_a
  #=> [["income:concessions", 0], ["noi", "722300"], ["purpose", "refinancing"],
  #    ["fees:fee-one", "0"], ["income:gross-income", "900000"],
  #    ["expenses:admin", "7500"], ["fees:fee-two", "0"],
  #    ["address:zip", "10019"], ["expenses:other", "0"]]

Continuing,

enum2 = enum1.with_object({})
  #=> #<Enumerator: #<Enumerator:
  #     ["income:concessions", 0, "noi", "722300", "purpose", "refinancing",
  #      "fees:fee-one", "0", "income:gross-income", "900000", "expenses:admin", "7500",
  #      "fees:fee-two", "0", "address:zip", "10019", "expenses:other", "0"]
  #      :each_slice(2)>:with_object({})>
enum2.to_a
  #=> [[["income:concessions", 0], {}], [["noi", "722300"], {}],
  #    [["purpose", "refinancing"], {}], [["fees:fee-one", "0"], {}],
  #    [["income:gross-income", "900000"], {}], [["expenses:admin", "7500"], {}],
  #    [["fees:fee-two", "0"], {}], [["address:zip", "10019"], {}],
  #    [["expenses:other", "0"], {}]]

enum2 can be thought of as a compound enumerator (though Ruby has no such concept). The hash being generated is initially empty, as shown, but will be filled in as additional elements are generated by enum2.

The first value is generated by enum2 and passed to the block, and the block variables are assigned values by a process called array decomposition.

(f,v),h = enum2.next
  #=> [["income:concessions", 0], {}]
f #=> "income:concessions"
v #=> 0
h #=> {}

We now perform the block calculation.

f.is_a?(String)
  #=> true
k,e = f.is_a?(String) ? f.split(':') : [f,nil]
  #=> ["income", "concessions"]
e.nil?
  #=> false
h[k] = e.nil? ? v : (h[k] || {}).merge(e=>v)
  #=> {"concessions"=>0}

h[k] equals nil if h does not have a key k. In that case (h[k] || {}) #=> {}. If h does have a key k (and h[k] is not nil), (h[k] || {}) #=> h[k].
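
As a small illustration of that fallback, the sketch below applies the same expression twice to one key, first when the key is absent and then when it is present. The literal keys are taken from the question's data; nothing else is assumed.

```ruby
h = {}

# Key absent: h["income"] is nil, so the pair is merged into a fresh {}.
h["income"] = (h["income"] || {}).merge("concessions" => 0)

# Key present: the existing nested hash is extended rather than replaced.
h["income"] = (h["income"] || {}).merge("gross-income" => "900000")

h # {"income"=>{"concessions"=>0, "gross-income"=>"900000"}}
```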

A second value is now generated by enum2 and passed to the block.

(f,v),h = enum2.next
  #=> [["noi", "722300"], {"income"=>{"concessions"=>0}}]
f #=> "noi"
v #=> "722300"
h #=> {"income"=>{"concessions"=>0}}

Notice that the hash, h, has been updated. Recall that it will be returned by with_object after all elements of enum2 have been generated. We now perform the block calculation.

f.is_a?(String)
  #=> true
k,e = f.is_a?(String) ? f.split(':') : [f,nil]
  #=> ["noi"]
e #=> nil
e.nil?
  #=> true
h[k] = e.nil? ? v : (h[k] || {}).merge(e=>v)
  #=> "722300"
h #=> {"income"=>{"concessions"=>0}, "noi"=>"722300"}

The remaining calculations are similar.

merge overwrites a duplicate key by default.

{ "income" => { "concessions" => 0 } }.merge({ "income" => { "gross-income" => "900000" } }) completely overwrites the original value of "income". What you want is a recursive merge: instead of merging only the top-level hash, you also merge the nested values when a key is duplicated.
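
A runnable version of that one-liner, as a minimal sketch (the variable names a, b and shallow are only for illustration):

```ruby
a = { "income" => { "concessions" => 0 } }
b = { "income" => { "gross-income" => "900000" } }

# Without a block, the value from b replaces the value from a for "income".
shallow = a.merge(b)

shallow # "concessions" is gone: {"income"=>{"gross-income"=>"900000"}}
```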

merge takes a block where you can specify what to do in the event of duplication. From the documentation:

merge!(other_hash){|key, oldval, newval| block} → hsh

Adds the contents of other_hash to hsh. If no block is specified, entries with duplicate keys are overwritten with the values from other_hash, otherwise the value of each duplicate key is determined by calling the block with the key, its value in hsh and its value in other_hash.

Using this, you can define a simple recursive merge in one line:
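
For example, a block can combine the two values instead of discarding one. This is a small sketch with made-up keys and numbers, not data from the question:

```ruby
totals = { "a" => 1, "b" => 2 }

# "b" exists in both hashes, so the block decides its value; "c" is simply added.
totals.merge!({ "b" => 3, "c" => 4 }) { |_key, oldval, newval| oldval + newval }

totals # {"a"=>1, "b"=>5, "c"=>4}
```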

def recursive_merge!(hash, other)
  hash.merge!(other) { |_key, old_val, new_val| recursive_merge!(old_val, new_val) }
end

values.each do |row|
  Hash[*row].each do |key, value|
    keys = key.split(':')

    if !data.dig(*keys)
      hh = keys.reverse.inject(value) { |a, n| { n => a } }
      recursive_merge!(data, hh)
    end
  end
end

A few more lines give you a more robust solution that overwrites duplicate keys whose values are not hashes, and that can even take a block, just like merge:

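def recursive_merge!(hash, other, &block)
  hash.merge!(other) do |key, old_val, new_val|
    if [old_val, new_val].all? { |v| v.is_a?(Hash) }
      # Both sides are hashes: descend and merge them recursively.
      recursive_merge!(old_val, new_val, &block)
    elsif block_given?
      # Let the caller's block resolve the conflict.
      block.call(key, old_val, new_val)
    else
      # Default behaviour of merge: the new value wins.
      new_val
    end
  end
end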

h1 = { a: true, b: { c: [1, 2, 3] } }
h2 = { a: false,  b: { x: [3, 4, 5] } }
recursive_merge!(h1, h2) { |_k, o, _n| o } # => { a: true,  b: { c: [1, 2, 3],  x: [3, 4, 5] } }

Note: This method reproduces the results you would get from ActiveSupport's Hash#deep_merge! if you're using Rails.

This is how I would handle this:

def new_h 
  Hash.new{|h,k| h[k] = new_h}
end 

values.flatten.each_slice(2).each_with_object(new_h) do |(k,v),obj|
  keys =  k.is_a?(String) ? k.split(':') : [k]
  if keys.count > 1
    set_key = keys.pop
    obj.merge!(keys.inject(new_h) {|memo,k1| memo[k1] = new_h})
      .dig(*keys)
      .merge!({set_key => v})
  else
    obj[k] = v
  end
end
#=> {"income"=>{"concessions"=>0, "gross-income"=>"900000"},
#    "noi"=>"722300",
#    "purpose"=>"refinancing",
#    "fees"=>{"fee-one"=>"0", "fee-two"=>"0"},
#    "expenses"=>{"admin"=>"7500", "other"=>"0"},
#    "address"=>{"zip"=>"10019"}}

Explanation:

  • Define a method ( new_h ) that returns a new Hash whose default for any missing key, at any level, is another new_h ( Hash.new{|h,k| h[k] = new_h} )
  • First flatten the Array ( values.flatten )
  • Then group each 2 elements together as pseudo key-value pairs ( .each_slice(2) )
  • Then iterate over the pairs using an accumulator in which each new element defaults to a Hash ( .each_with_object(new_h) do |(k,v),obj| )
  • Split the pseudo key on a colon ( keys = k.is_a?(String) ? k.split(':') : [k] )
  • If there is a split, create the parent key(s) ( obj.merge!(keys.inject(new_h) {|memo,k1| memo[k1] = new_h}) )
  • Merge the last child key with its value into the innermost parent ( .dig(*keys).merge!({set_key => v}) )
  • Otherwise set the single key equal to the value ( obj[k] = v )
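
The self-referencing default is what makes arbitrary depth possible. A minimal sketch of new_h on its own, where the "a", "b", "c" keys are made up for illustration:

```ruby
def new_h
  # Any missing key is created on first access as another auto-vivifying hash.
  Hash.new { |h, k| h[k] = new_h }
end

h = new_h
h["income"]["concessions"] = 0   # "income" is created on first access
h["a"]["b"]["c"] = 1             # works at any depth

h # {"income"=>{"concessions"=>0}, "a"=>{"b"=>{"c"=>1}}}
```

One consequence of this design is that merely reading a missing key (for example h["x"]) also inserts an empty hash under that key.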

This works to any depth as long as the key chain is consistent. If it is not, say [["income:concessions:other", 12], ["income:concessions", 0]], the latter value takes precedence. (Note: this applies to all the answers in one way or another. For example, with the accepted answer the former wins, but a value is still lost because the data structure is inconsistent.)
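
The broken-chain case can be checked with a small sketch. It wraps the same logic in a hypothetical helper named build (not part of the answer above) and leaves out the merge!/inject step, relying on Hash#dig honouring the default proc to create the parents:

```ruby
def new_h
  Hash.new { |h, k| h[k] = new_h }
end

# Hypothetical helper: takes [key, value] pairs and returns the nested hash.
def build(pairs)
  pairs.each_with_object(new_h) do |(k, v), obj|
    keys = k.is_a?(String) ? k.split(':') : [k]
    if keys.count > 1
      set_key = keys.pop
      # dig uses the default proc, so missing parents are created here.
      obj.dig(*keys).merge!(set_key => v)
    else
      obj[k] = v
    end
  end
end

result = build([["income:concessions:other", 12], ["income:concessions", 0]])
result # the later pair replaces the nested hash: {"income"=>{"concessions"=>0}}
```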
