
Ruby group_by an array of hashes with different keys (keys are not fixed)

I want to group an array of hashes that looks like this:

array = [{"value"=>[{"a"=>1},{"b"=>4}]},{"value"=>[{"c"=>4},{"d"=>3},{"a"=>3},{"b"=>54}]}]

to:

grouped_data = {"a"=>[1,3],"b"=>[4,54],"c"=>[4],"d"=>[3]}

I can convert array to array#1 = [{"a"=>1}, {"b"=>4}, {"c"=>4}, {"d"=>3}, {"a"=>3}, {"b"=>54}] using array.map(&:values).flatten. I could then build the grouped_data hash from array#1 by looping over all the data, but I need a more efficient way, such as using group_by over dynamic keys (the keys are not fixed).

I know how to group when the key is fixed. I need to group_by on dynamically changing keys.

I don't expect to win any readability awards for this one...

array.map(&:values)
     .flatten
     .group_by { |o| o.keys.first }
     .map { |key, v| [key, v.map(&:values).flatten] }
     .to_h
=> {"a"=>[1, 3], "b"=>[4, 54], "c"=>[4], "d"=>[3]}
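For comparison, here is a slightly more readable sketch of the same group_by idea. It assumes Ruby 2.4 or later (for Hash#transform_values) and, like the chain above, that every inner hash holds exactly one key. It is a variation on the approach, not the code from this answer.

```ruby
# Sample input copied from the question.
array = [
  { "value" => [{ "a" => 1 }, { "b" => 4 }] },
  { "value" => [{ "c" => 4 }, { "d" => 3 }, { "a" => 3 }, { "b" => 54 }] }
]

grouped_data = array
  .flat_map { |outer| outer["value"] }                    # one flat list of single-key hashes
  .group_by { |pair| pair.keys.first }                    # bucket the hashes by their only key
  .transform_values { |pairs| pairs.flat_map(&:values) }  # keep just the values in each bucket

p grouped_data
```

Reading outer["value"] directly instead of calling map(&:values).flatten also avoids picking up any other keys the outer hashes might carry.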

I put together some rough benchmarks if anyone was curious:

require 'benchmark'

n = 10000
letters = ('a'...'z').to_a
numbers = (0...1000).to_a

built_array = []
n.times do |i|
  values = []
  obj_size = (1...letters.size).to_a.sample
  obj_size.times do |j|
    values << {
      "#{letters.sample}" => numbers.sample
    }
  end
  built_array << { "value" => values }
end

Benchmark.bm(15) do |x|
  x.report("anthony") { anthony(built_array) }
  x.report("eric each") { eric_each(built_array) }
  x.report("eric ewo") { eric_each_with_object(built_array) }
  x.report("eric merge") { eric_merge(built_array) }
  x.report("ed inject") { ed_inject(built_array) }
end

                      user     system      total        real
anthony           0.130000   0.010000   0.140000 (  0.146601)
eric each         0.060000   0.000000   0.060000 (  0.067160)
eric ewo          0.070000   0.000000   0.070000 (  0.076125)
eric merge       25.250000   0.880000  26.130000 ( 28.297592)
ed inject         0.080000   0.010000   0.090000 (  0.111045)
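The benchmark script calls five methods that are not shown in the post. A plausible reconstruction, assuming each one simply wraps the corresponding answer on this page, could look like the following. The method names come from the benchmark labels; the bodies are my guess at what was timed, not the original benchmark source.

```ruby
# Hypothetical wrappers around the answers on this page.

def anthony(array)
  array.map(&:values)
       .flatten
       .group_by { |o| o.keys.first }
       .map { |key, v| [key, v.map(&:values).flatten] }
       .to_h
end

def eric_each(array)
  grouped = Hash.new { |h, k| h[k] = [] }
  array.each do |subhash|
    subhash['value'].each do |subsubhash|
      subsubhash.each { |key, value| grouped[key] << value }
    end
  end
  grouped
end

def eric_each_with_object(array)
  grouped = Hash.new { |h, k| h[k] = [] }
  array.map(&:values).flatten.each_with_object(grouped) do |subhash, data|
    subhash.each { |k, v| data[k] << v }
  end
end

def eric_merge(array)
  # Builds a new hash, and a new flattened array on every key collision,
  # at each step; that repeated copying is a likely cause of the slow timing.
  array.map(&:values).flatten.inject do |mem, hash|
    mem.merge(hash) { |_k, o, n| [o, n].flatten }
  end
end

def ed_inject(array)
  array.inject({}) do |h, o|
    o['value'].each do |sh|
      h[sh.keys[0]] = [] if h[sh.keys[0]].nil?
      h[sh.keys[0]] << sh.values[0]
    end
    h
  end
end
```

With these definitions placed above the Benchmark.bm call, the script should run as posted, although the timings will differ by machine and Ruby version.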

Interesting data structure you have here :D

with each

array = [{ 'value' => [{ 'a' => 1 }, { 'b' => 4 }] }, { 'value' => [{ 'c' => 4 }, { 'd' => 3 }, { 'a' => 3 }, { 'b' => 54 }] }]

grouped_data = Hash.new { |h, k| h[k] = [] }

array.each do |subhash|
  subhash['value'].each do |subsubhash|
    subsubhash.each do |key, value|
      grouped_data[key] << value
    end
  end
end

p grouped_data
#=> {"a"=>[1, 3], "b"=>[4, 54], "c"=>[4], "d"=>[3]}

each_with_object

Starting from the array.map(&:values).flatten step in your question, you could also write:

grouped_data = Hash.new { |h, k| h[k] = [] }

p array.map(&:values).flatten.each_with_object(grouped_data){|subhash,data| 
  subhash.each do |k,v|
    data[k] << v
  end
}
#=> {"a"=>[1, 3], "b"=>[4, 54], "c"=>[4], "d"=>[3]}

merge

Another option would be with merge:

p array.map(&:values).flatten.inject{|mem,hash| mem.merge(hash){|k,o,n| [o,n].flatten}}
#=> {"a"=>[1, 3], "b"=>[4, 54], "c"=>4, "d"=>3}

Note that the output is different though. If there's only one value for a letter, it's returned as an integer, not as a 1-element array.
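If the merge approach is preferred but every value should be an array, one possible tweak is to wrap each value in a one-element array before merging, so the conflict block only ever concatenates arrays. This is a sketch assuming Ruby 2.4 or later (Hash#transform_values); it fixes the output shape only and still builds a new hash at every step.

```ruby
# Sample input copied from the question.
array = [
  { "value" => [{ "a" => 1 }, { "b" => 4 }] },
  { "value" => [{ "c" => 4 }, { "d" => 3 }, { "a" => 3 }, { "b" => 54 }] }
]

grouped_data = array
  .flat_map { |outer| outer["value"] }
  .map { |pair| pair.transform_values { |v| [v] } }  # {"a"=>1} becomes {"a"=>[1]}
  .reduce({}) { |memo, pair| memo.merge(pair) { |_key, old, new| old + new } }

p grouped_data
```

Here a letter that appears only once, such as "c", ends up as a one-element array instead of a bare integer.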

I agree with Eric Duminil. Interesting data structure.

With #inject

array = [{"value"=>[{"a"=>1},{"b"=>4}]},{"value"=>[{"c"=>4},{"d"=>3},{"a"=>3},{"b"=>54}]}]

new_hash = array.inject(Hash.new) do |h,o|
  o['value'].each do |sh|
    h[sh.keys[0]] = [] if h[sh.keys[0]].nil?
    h[sh.keys[0]] << sh.values[0] 
  end
  h
end

puts new_hash

This won't beat Eric's answer using #merge for brevity, but it does the job, and every value comes back as an array:

#=> {"a"=>[1, 3], "b"=>[4, 54], "c"=>[4], "d"=>[3]}
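The nil check in the #inject version can be folded into ||=, and each_with_object removes the need to return h at the end of the block. The following is a minor variation under those assumptions, not a different algorithm; the block parameter names are my own.

```ruby
# Sample input copied from the question.
array = [
  { "value" => [{ "a" => 1 }, { "b" => 4 }] },
  { "value" => [{ "c" => 4 }, { "d" => 3 }, { "a" => 3 }, { "b" => 54 }] }
]

new_hash = array.each_with_object({}) do |outer, grouped|
  outer['value'].each do |pair|
    # Create the array the first time a key is seen, then append to it.
    pair.each { |key, value| (grouped[key] ||= []) << value }
  end
end

p new_hash
```

Unlike a hash created with Hash.new { |h, k| h[k] = [] }, the result is a plain hash, so looking up a missing key later returns nil rather than silently adding an empty array.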
