Binning data from a hash in Ruby

Question

I am trying to group users to create scatter plots from their data, which comes as a Ruby array of hashes that looks like this:

[{"userid"=>"1275", "num"=>"1", "amount"=>"15.00"}, 
 {"userid"=>"1286", "num"=>"3", "amount"=>"26.67"}, .... ]

Basically, the values in num can be integers from 1 to 4, while amount goes up to ~100. I want to bin two levels deep: first group by num, then divide each of the 4 new bins further by amount (0-20, 20-50, 50-80, 80+), for 16 groups total.

The end product should be an array of hashes, or an array of arrays, which I could then pass on to my view to plot in d3. I have a working version that uses case statements and basic flow control, but I'd like to do this with group_by to get shorter, more elegant code.

I don't really understand the documentation on group_by, so any help would be appreciated.

EDIT: The output should look more or less like this:

[[{"userid"=>"1", "num"=>"1", "amount"=>"15.00"},
  {"userid"=>"2", "num"=>"1", "amount"=>"19.00"}],
 [{"userid"=>"3", "num"=>"1", "amount"=>"25.00"},
  {"userid"=>"4", "num"=>"1", "amount"=>"30.00"}],
 [{"userid"=>"5", "num"=>"2", "amount"=>"15.00"}]]

Basically an array with up to 16 sub-arrays of the user hashes.

Answer 1

It looks like you can do this by applying two different group_by operations:

data = [
  {"userid"=>"1", "num"=>"1", "amount"=>"15.00"},
  {"userid"=>"2", "num"=>"1", "amount"=>"19.00"},
  {"userid"=>"3", "num"=>"1", "amount"=>"25.00"},
  {"userid"=>"4", "num"=>"1", "amount"=>"30.00"},
  {"userid"=>"5", "num"=>"2", "amount"=>"15.00"}
]

# Establish the arbitrary groupings as a set of functions which
# can be evaluated. If these overlap in ranges, the first match
# will be used.
groupings = [
  lambda { |v| v >= 0 && v <= 20 },
  lambda { |v| v > 20 && v <= 50 },
  lambda { |v| v > 50 && v <= 80 },
  lambda { |v| v > 80 }
]

data.group_by do |element|
  # Group by the 'num' key first
  element['num']
end.flat_map do |num, elements|
  # Then group these sets by which of the range buckets
  # they should be sorted into.
  elements.group_by do |element|
    # Create an array that looks like [ false, true, false, ... ]
    # based on the test results, then find the index of the
    # first true entry.
    groupings.map do |fn|
      fn.call(element['amount'].to_f)
    end.index(true)
  end.values
end

# => [[{"userid"=>"1", "num"=>"1", "amount"=>"15.00"}, {"userid"=>"2", "num"=>"1", "amount"=>"19.00"}], [{"userid"=>"3", "num"=>"1", "amount"=>"25.00"}, {"userid"=>"4", "num"=>"1", "amount"=>"30.00"}], [{"userid"=>"5", "num"=>"2", "amount"=>"15.00"}]]

Calling .values on the result of a group_by gives you just the grouped sets, not the keys that indicate which group each one is.
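To make the point about .values concrete, here is a small sketch. The sample rows and the names SAMPLE and grouped_by_num are made up for illustration and are not part of the original answer.

```ruby
# Hypothetical sample data, not taken from the question.
SAMPLE = [
  { "userid" => "1", "num" => "1" },
  { "userid" => "2", "num" => "2" },
  { "userid" => "3", "num" => "1" }
]

# group_by returns a Hash mapping each block result to the
# array of elements that produced it.
def grouped_by_num(rows)
  rows.group_by { |row| row["num"] }
end

grouped = grouped_by_num(SAMPLE)

# The keys say which group is which...
p grouped.keys
# => ["1", "2"]

# ...while .values keeps only the grouped arrays.
p grouped.values.map { |rows| rows.map { |r| r["userid"] } }
# => [["1", "3"], ["2"]]
```

If you need to label the d3 series later, keep the hash and drop the .values call.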

Answer 2

Maybe like this?

I'm using the array group_by function, but I also bin the amount and fold that bin into the group_by key:

# Define bin before it is used: in a plain script the method must
# already exist when the group_by block calls it.
def bin(val)
  # to_f rather than to_i, so that e.g. "20.50" is not truncated to 20
  amount = val.to_f
  return 0 if amount <= 20
  return 1 if amount <= 50
  return 2 if amount <= 80
  3
end

arr = [{"userid"=>"1", "num"=>"1", "amount"=>"15.00"},{"userid"=>"2", "num"=>"1", "amount"=>"19.00"},{"userid"=>"3", "num"=>"1", "amount"=>"25.00"},{"userid"=>"4", "num"=>"1", "amount"=>"30.00"},{"userid"=>"5", "num"=>"2", "amount"=>"15.00"}]

a2 = arr.group_by { |i| (i['num'].to_i - 1) + 4 * bin(i['amount']) }.values

and the result contains exactly the groups you wanted (on Ruby 1.9 and later the groups appear in the order their keys are first seen):

[[{"userid"=>"1", "num"=>"1", "amount"=>"15.00"}, {"userid"=>"2", "num"=>"1", "amount"=>"19.00"}], [{"userid"=>"3", "num"=>"1", "amount"=>"25.00"}, {"userid"=>"4", "num"=>"1", "amount"=>"30.00"}], [{"userid"=>"5", "num"=>"2", "amount"=>"15.00"}]]

I'm actually mapping two parameters onto a single one-dimensional key (like a hash function). So the function is actually

<max value of num>*<bin according to amount>+<num-1>

If the max value of num is 4, then bin 0 maps to 0..3, bin 1 maps to 4..7, bin 2 maps to 8..11 and bin 3 maps to 12..15. Note that there is no overlap, which is important.
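A quick way to check the no-overlap claim is to enumerate every (num, bin) pair. This is only a sketch, assuming num runs from 1 to 4 and bins from 0 to 3; MAX_NUM and combined_key are illustrative names that do not appear in the answer's code.

```ruby
# Assumed upper bound for num (1..4 per the question).
MAX_NUM = 4

# <max value of num> * <bin according to amount> + <num - 1>
def combined_key(num, bin_index)
  MAX_NUM * bin_index + (num - 1)
end

# Build the key for every combination of bin and num.
keys = (0..3).flat_map do |bin_index|
  (1..MAX_NUM).map { |num| combined_key(num, bin_index) }
end

p keys == (0..15).to_a
# => true
p keys.uniq.length
# => 16
```

All 16 combinations land on distinct keys, so no two (num, bin) pairs can end up in the same group.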

Answer 3

I figured out a way to do it, and added another piece of code to unwrap each hash and return just the user ids in each group:

users_by_number = firstMonth.group_by { |i| i["num"] }
users_by_number.each_pair do |key, value|
    users_by_number[key] = value.group_by do |j|
        case
        when j["amount"].to_f <= 20 then :twenty
        when j["amount"].to_f <= 50 then :twenty_fifty
        when j["amount"].to_f <= 80 then :fifty_eighty
        when j["amount"].to_f > 80 then :eighty_plus
        end
    end

    users_by_number[key].each_pair do |group, users|
        users_by_number[key][group] = users.map! { |user| user["userid"].to_i }
    end
end
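For reference, here is a non-mutating sketch of the same idea run against the sample rows from Answer 1. It assumes Ruby 2.4 or later for transform_values; ROWS, bucket_for and user_ids_by_group are hypothetical names and not part of the code above.

```ruby
# Sample rows copied from Answer 1.
ROWS = [
  { "userid" => "1", "num" => "1", "amount" => "15.00" },
  { "userid" => "2", "num" => "1", "amount" => "19.00" },
  { "userid" => "3", "num" => "1", "amount" => "25.00" },
  { "userid" => "4", "num" => "1", "amount" => "30.00" },
  { "userid" => "5", "num" => "2", "amount" => "15.00" }
]

# Same thresholds as the case statement above.
def bucket_for(amount)
  if amount <= 20 then :twenty
  elsif amount <= 50 then :twenty_fifty
  elsif amount <= 80 then :fifty_eighty
  else :eighty_plus
  end
end

# Group by num, then by amount bucket, then keep only integer user ids.
def user_ids_by_group(rows)
  rows.group_by { |row| row["num"] }.transform_values do |same_num|
    same_num
      .group_by { |row| bucket_for(row["amount"].to_f) }
      .transform_values { |users| users.map { |u| u["userid"].to_i } }
  end
end

result = user_ids_by_group(ROWS)
# result is equal to:
#   { "1" => { twenty: [1, 2], twenty_fifty: [3, 4] }, "2" => { twenty: [5] } }
p result
```

Unlike the version above, this leaves the input rows untouched and keeps the bucket names as keys, which may be handy for labelling the plots.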
