I have a set of data that is an array of hashes, with each hash representing one record of data:
data = [
{
:id => "12345",
:bucket_1_rank => "2",
:bucket_1_count => "12",
:bucket_2_rank => "7",
:bucket_2_count => "25"
},
{
:id => "45678",
:bucket_1_rank => "2",
:bucket_1_count => "15",
:bucket_2_rank => "9",
:bucket_2_count => "68"
},
{
:id => "78901",
:bucket_1_rank => "5",
:bucket_1_count => "36"
}
]
The ranks values are always between 1 and 10.
What I am trying to do is select each of the possible values for the rank fields (the :bucket_1_rank
and :bucket_2_rank
fields) as keys in my final resultset, and the values for each key will be an array of all the values in its associated :bucket_count
field. So, for the data above, the final resulting structure I have in mind is something like:
bucket 1:
{"2" => ["12", "15"], "5" => ["36"]}
bucket 2:
{"7" => ["25"], "9" => ["68"]}
I can do this working under the assumption that the field names stay the same, or through hard coding the field/key names, or just using group_by
for the fields I need, but my problem is that I work with a different data set each month where the rank fields are named slightly differently depending on the project specs, and I want to identify the names for the count and rank fields dynamically as opposed to hard coding the field names.
I wrote two quick helpers get_ranks
and get_buckets
that use regex to return an array of fieldnames that are either ranks or count fields, since these fields will always have the literal string "_rank" or "_count" in their names:
ranks = get_ranks
counts = get_counts
results = Hash.new{|h,k| h[k] = []}
data.each do |i|
ranks.each do |r|
unless i[r].nil?
counts.each do |c|
results[i[r]] << i[c]
end
end
end
end
p results
This seems to be close, but feels awkward, and it seems to me there has to be a better way to iterate through this data set. Since I haven't worked on this project using Ruby I'd use this as an opportunity to improve my understanding iterating through arrays of hashes, populating a hash with arrays as values, etc. Any resources/suggestions would be much appreciated.
You could shorten it to:
result = Hash.new{|h,k| h[k] = Hash.new{|h2,k2| h2[k2] = []}}
data.each do |hsh|
hsh.each do |key, value|
result[$1][value] << hsh["#{$1}_count".to_sym] if key =~ /(.*)_rank$/
end
end
puts result
#=> {"bucket_1"=>{"2"=>["12", "15"], "5"=>["36"]}, "bucket_2"=>{"7"=>["25"], "9"=>["68"]}}
Though this is assuming that :bucket_2_item_count
is actually supposed to be :bucket_2_count
.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.