
Merge array of hashes: value should be the average of merged values

Issue: Merge the hashes in an array that share the same value for a specified key, and average the values of the other keys.

My solution seems ugly.

data:

require 'pp'

arr = [{:red=>346.0,
  :unu=>10.0,
  :used=>20147.0,
  :acc_id=>550,
  :percent=>0.01},
 {:red=>0.0,
  :unu=>1.0,
  :used=>66.0,
  :acc_id=>550,
  :percent=>0.06},
 {:red=>120.0,
  :unu=>11.0,
  :used=>166.0,
  :acc_id=>550,
  :percent=>10.06},
 {:red=>1306.0,
  :unu=>1.0,
  :used=>13259.0,
  :acc_id=>9999,
  :percent=>0.0}]

In the current example we should merge the 3 hashes with :acc_id = 550, and the resulting array should contain two hashes: the merged hash with :acc_id = 550 and the untouched hash with :acc_id = 9999.

algorithm:

data = []
arr.group_by{|h| h[:acc_id] }.map {|_, arr_of_hashes|
  sz = arr_of_hashes.size
  if sz > 1
    arr_of_hashes = arr_of_hashes.inject{|memo, el|
      memo.merge(el) {|k, old_v, new_v| old_v + new_v}
    }

    arr_of_hashes.map {|k, v| arr_of_hashes[k] = v / sz}
  end
  data << arr_of_hashes if arr_of_hashes.is_a? Hash
  data << arr_of_hashes[0] if arr_of_hashes.is_a? Array
}

pp data

Expected result: Array of merged hashes

[{:red=>155.33333333333334,
  :unu=>7.333333333333333,
  :used=>6793.0,
  :acc_id=>550,
  :percent=>3.376666666666667},
 {:red=>1306.0,
  :unu=>1.0,
  :used=>13259.0,
  :acc_id=>9999,
  :percent=>0.0}]


I found one bug: you perform / and + on :acc_id. I have refactored the code, so try the version below, though I suspect it can still be improved.

data = []
arr.group_by { |h| h[:acc_id] }.each do |acc_id, arr_of_hashes|
  sz = arr_of_hashes.size
  result = Hash.new(0)
  # Add each value's share of the average, skipping the grouping key.
  arr_of_hashes.each do |hash|
    hash.each { |k, v| result[k] += v.fdiv(sz) unless k == :acc_id }
  end
  result[:acc_id] = acc_id
  data << result
end
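The same idea can also be written by summing each group first and dividing once at the end, which avoids building up a shared `data` array. This is only a sketch: the method name `average_by` and the trimmed-down `sample` array are made up for illustration, and `Hash#transform_values` assumes Ruby 2.4 or later.

```ruby
# Hypothetical helper (not from the original answers): sum each group's
# values, then divide once by the group size. The grouping key is left as-is.
def average_by(arr, key)
  arr.group_by { |h| h[key] }.map do |id, group|
    sums = group.each_with_object(Hash.new(0)) do |h, acc|
      h.each { |k, v| acc[k] += v unless k == key }
    end
    averages = sums.transform_values { |v| v.fdiv(group.size) }
    averages.merge(key => id)
  end
end

# Illustrative data only, shortened from the question's array.
sample = [{ red: 346.0,  unu: 10.0, acc_id: 550 },
          { red: 0.0,    unu: 1.0,  acc_id: 550 },
          { red: 1306.0, unu: 1.0,  acc_id: 9999 }]

average_by(sample, :acc_id)
  #=> [{:red=>173.0, :unu=>5.5, :acc_id=>550},
  #    {:red=>1306.0, :unu=>1.0, :acc_id=>9999}]
```

Because the division uses `fdiv`, integer values would also produce a floating-point average instead of being truncated.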

I suggest you compute the averages as follows. I assume that, as in the example, all hashes have the same keys (though they need not appear in the same order).

Code

def doit(arr, key)
  keys = arr.first.keys
  arr.group_by { |g| g[key] }.
      map do |_,a| 
        averages = a.map { |h| h.values_at(*keys) }.
                     transpose.
                     map { |v| v.sum.fdiv(v.size) }
        keys.zip(averages).to_h
      end
end 

Example

Notice that my array of hashes is somewhat different from the one given in the question's example. Specifically, there are three (rather than two) groups of hashes that share a common value for the key :acc_id.

arr = [{ red: 346.0,  unu: 10.0, used: 20147.0, acc_id: 550,  percent: 0.01 },
       { red: 0.0,    unu: 1.0,  used: 66.0,    acc_id: 550,  percent: 0.06 },
       { red: 120.0,  unu: 11.0, used: 166.0,   acc_id: 10,   percent: 10.06 },
       { red: 100.0,  unu: 19.0, used: 170.0,   acc_id: 10,   percent: 11.56 },
       { red: 1306.0, unu: 1.0,  used: 13259.0, acc_id: 9999, percent: 0.0 }]

doit(arr, :acc_id)
  #=> [{:red=>173.0, :unu=>5.5, :used=>10106.5, :acc_id=>550.0, :percent=>0.035},
  #    {:red=>110.0, :unu=>15.0, :used=>168.0, :acc_id=>10.0, :percent=>10.81},
  #    {:red=>1306.0, :unu=>1.0, :used=>13259.0, :acc_id=>9999.0, :percent=>0.0}]
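Note that in the output above the grouping key is averaged along with everything else, so :acc_id comes back as a Float (550.0 rather than 550). If the original value should be preserved, one possible variation is to exclude the key from the averaged columns and put it back afterwards. The name `doit_keeping_key` and the shortened `sample` array are hypothetical, used only for this sketch.

```ruby
# Hypothetical variation of doit: same values_at/transpose approach, but the
# grouping key is excluded from the averaging and re-attached unchanged.
def doit_keeping_key(arr, key)
  keys = arr.first.keys - [key]
  arr.group_by { |g| g[key] }.map do |id, a|
    averages = a.map { |h| h.values_at(*keys) }
                .transpose
                .map { |v| v.sum.fdiv(v.size) }
    { key => id }.merge(keys.zip(averages).to_h)
  end
end

# Illustrative data only.
sample = [{ red: 120.0,  unu: 11.0, acc_id: 10 },
          { red: 100.0,  unu: 19.0, acc_id: 10 },
          { red: 1306.0, unu: 1.0,  acc_id: 9999 }]

doit_keeping_key(sample, :acc_id)
  #=> [{:acc_id=>10, :red=>110.0, :unu=>15.0},
  #    {:acc_id=>9999, :red=>1306.0, :unu=>1.0}]
```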

Explanation

See Enumerable#group_by and Array#sum (the latter having made its debut in Ruby v2.4).

The steps are as follows.
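For readers unfamiliar with these methods, here is a minimal illustration of each on toy values (the variable names are made up for this sketch). It also shows why `fdiv` is used rather than `/`, which truncates when both operands are integers.

```ruby
# Enumerable#group_by builds a hash keyed by the block's return value.
grouped = [1, 2, 3, 4].group_by(&:odd?)
  #=> {true=>[1, 3], false=>[2, 4]}

# Array#sum (Ruby 2.4+) adds up the elements.
total = [1.5, 2.5].sum
  #=> 4.0

# fdiv always performs floating-point division; Integer#/ truncates.
float_quotient = 7.fdiv(2)
  #=> 3.5
int_quotient = 7 / 2
  #=> 3
```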

key = :acc_id
keys = arr.first.keys
  #=> [:red, :unu, :used, :acc_id, :percent]
b = arr.group_by { |g| g[key] }
  #=> { 550=>[{:red=>346.0, :unu=>10.0, :used=>20147.0, :acc_id=>550, :percent=>0.01},
  #           {:red=>0.0, :unu=>1.0, :used=>66.0, :acc_id=>550, :percent=>0.06}],
  #      10=>[{:red=>120.0, :unu=>11.0, :used=>166.0, :acc_id=>10, :percent=>10.06},
  #           {:red=>100.0, :unu=>19.0, :used=>170.0, :acc_id=>10, :percent=>11.56}],
  #    9999=>[{:red=>1306.0, :unu=>1.0, :used=>13259.0, :acc_id=>9999, :percent=>0.0}]}
b.map do |_,a| 
  averages = a.map { |h| h.values_at(*keys) }.
               transpose.
               map { |v| v.sum.fdiv(v.size) }
  keys.zip(averages).to_h
end 
  #=> <the array of hashes shown above>
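To see what happens inside the block, the intermediate values for a single group can be inspected one step at a time. The sketch below uses a shortened, illustrative group (only three keys, with the second hash deliberately listing its keys in a different order) to show that `values_at(*keys)` lines the values up regardless of key order.

```ruby
keys = [:red, :unu, :acc_id]
a = [{ red: 346.0, unu: 10.0, acc_id: 550 },
     { unu: 1.0, red: 0.0, acc_id: 550 }]   # same keys, different order

# One row of values per hash, always in the order given by keys.
rows = a.map { |h| h.values_at(*keys) }
  #=> [[346.0, 10.0, 550], [0.0, 1.0, 550]]

# transpose turns rows into columns: one array per key.
cols = rows.transpose
  #=> [[346.0, 0.0], [10.0, 1.0], [550, 550]]

# Average each column.
averages = cols.map { |v| v.sum.fdiv(v.size) }
  #=> [173.0, 5.5, 550.0]

# Pair each key with its average.
merged = keys.zip(averages).to_h
  #=> {:red=>173.0, :unu=>5.5, :acc_id=>550.0}
```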
