
Extra nested second level hash of arrays in Ruby

I have this input:

Us,1,1,F
Us,1,2,O
Us,2,1,N 
Pa,1,1,S
Pa,1,3, D
Pa,1,5,H
Pa,4,7,K

I'm trying to build a hash of arrays whose elements are, in turn, hashes of arrays. I'd like to get this hash:

b = {
  "Us" => [
    {"1" => [["1", "F"], ["2", "O"]]},
    {"2" => [["1", "N"]]}
  ],
  "Pa" => [
    {"1" => [["1", "S"], ["3", "D"], ["5", "H"]]},
    {"4" => [["7", "K"]]}
  ]
}

This is my code:

a = Hash.new{|hsh, key| hsh[key] = []}
b = Hash.new{|hsh, key| hsh[key] = []}
File.readlines('file.txt').each do |line|
  r = line.split(",")
  a[r[0] + "°" + r[1]].push [r[2], r[3].strip] # I load hash "a" here
end

a.map{|k, v|
  m=k.split("°")
  b[m[0]].push [m[1]=> v] # I load hash "b" here
}

The keys of the hash are the unique combinations of values in Column1 and Column2 (Col1 ° Col2), and the values are the relations between Col2 (key of 2nd level hash), Col3, and Col4 (These two as elements of the internal arrays).

I'm almost getting the result, but there is an extra level of nesting. I get this result:

b = {
  "Us"=>[
    [{"1"=>[["1", "F"], ["2", "O"]]}],
    [{"2"=>[["1", "N"]]}]
  ],
  "Pa"=>[
    [{"1"=>[["1", "S"], ["3", "D"], ["5", "H"]]}],
    [{"4"=>[["7", "K"]]}]
  ]
}

Please give me some help.

UPDATE

A shorter version of the code, based on Cary's suggestion.

a = Hash.new{|hsh, key| hsh[key] = []}
b = Hash.new{|hsh, key| hsh[key] = []}

File.readlines('input').each do |line|
  r = line.chomp.split(",")
  a[[r[0], r[1]]].push [r[2], r[3]]
end

a.each{|k, v|
  b[k[0]].concat [k[1] => v]    
}

UPDATE2

With Cary's help I was able to get my final output. Below I show why I wanted a hash of arrays whose elements are themselves hashes of arrays.

This is the output. It is organized like a book index: it shows the sections ("Us" and "Pa"), then the chapters of each section (1 and 2 for "Us", 1 and 4 for "Pa"), and then, for each chapter, each article and its description. For example, article "3" has the description "D", so "D" is printed next to "3"; article "3" belongs to chapter "1", which belongs to section "Pa".

Us
    ......1
    ..............1.......F
    ..............2.......O
    ......2
    ..............1.......N
Pa
    ......1
    ..............1.......S
    ..............3.......D
    ..............5.......H
    ......4
    ..............7.......K
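
A minimal sketch of how an index like this could be printed from the hash b. The helper name index_lines and the two leader-width constants are assumptions made for illustration (the widths only approximate the sample above), not part of the original code.

```ruby
b = {
  "Us" => [{ "1" => [["1", "F"], ["2", "O"]] }, { "2" => [["1", "N"]] }],
  "Pa" => [{ "1" => [["1", "S"], ["3", "D"], ["5", "H"]] }, { "4" => [["7", "K"]] }]
}

# Assumed leader widths, chosen to approximate the sample output.
CHAPTER_LEADER = "." * 6
ARTICLE_LEADER = "." * 14
TEXT_LEADER    = "." * 7

# Hypothetical helper: walks section -> chapter -> article and
# returns one string per output line.
def index_lines(sections)
  lines = []
  sections.each do |section, chapters|
    lines << section
    chapters.each do |chapter_hash|
      chapter_hash.each do |chapter, articles|
        lines << "#{CHAPTER_LEADER}#{chapter}"
        articles.each do |article, description|
          lines << "#{ARTICLE_LEADER}#{article}#{TEXT_LEADER}#{description}"
        end
      end
    end
  end
  lines
end

puts index_lines(b)
```

Returning the lines instead of printing them directly keeps the helper easy to check; puts prints each array element on its own line.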

Thanks for the great help!

You can fix your code by replacing

b[m[0]].push [m[1]=>v]

with

b[m[0]] += [m[1]=> v]

or

b[m[0]].concat [m[1]=> v]

As you know, it is the value of b after the code has been executed that you want, so you should add b as a final line.
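
The reason for the extra nesting is that [m[1]=> v] is itself an array holding one hash. A small sketch of the difference, using made-up variable names (pair, pushed, concatenated) purely for illustration:

```ruby
# [key => value] is an array literal holding one hash: [{ key => value }].
pair = ["1" => [["1", "F"]]]

pushed = []
pushed.push(pair)          # appends the array itself: one extra level of nesting

concatenated = []
concatenated.concat(pair)  # appends the array's elements: only the hash

added = []
added += pair              # builds a new array from the elements of both

p pushed        # the hash wrapped in an extra array
p concatenated  # just the hash
p added         # same as concatenated
```

concat changes the receiver in place, whereas += creates a new array and reassigns it; for this problem either gives the wanted shape.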

A few other observations:

  • if r = line.split(",") is changed to r = line.chomp.split(",") the following line is simplified, since chomp already removes the trailing newline (strip is then needed only for fields that carry stray spaces, such as " D").
  • a.map { |k,v|... can be replaced with a.each { |k,v|... , which is more appropriate and reads better, since the return value of map is not used.
  • a[r[0] + "°" + r[1]]... made my eyes hurt. You never need to resort to such hacks. You could have instead used an array as the key, writing a[[r[0], r[1]]]... , deleted m=k.split("°") and replaced the next line with b[k[0]] += [k[1]=> v] .
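
The last point can be sketched as follows, with a few rows hard-coded in place of the file. The block variable names (section, chapter, rows) are illustrative assumptions.

```ruby
a = Hash.new { |hsh, key| hsh[key] = [] }

# An array works as a hash key: arrays with equal elements are eql?
# and have the same hash value, so no joined string key is needed.
a[["Us", "1"]].push(["1", "F"])
a[["Us", "1"]].push(["2", "O"])
a[["Us", "2"]].push(["1", "N"])

b = Hash.new { |hsh, key| hsh[key] = [] }

# The array key is destructured directly in the block parameters.
a.each { |(section, chapter), rows| b[section] += [chapter => rows] }

p b
```

Note the double brackets: the outer pair is the call to Hash#[] and the inner pair is the array used as the key.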

Here are two other ways you could do that. Both approaches use the method Hash#transform_values , which made its debut in Ruby v2.4.

str =<<_
Us,1,1,F
Us,1,2,O
Us,2,1,N 
Pa,1,1,S
Pa,1,3, D
Pa,1,5,H
Pa,4,7,K
_

Use Enumerable#group_by

str.lines.
    map { |line| line.chomp.split(',') }.
    group_by(&:shift).
    transform_values { |arr| arr.group_by(&:shift).map { |k,v| { k=>v } } }
  #=> {"Us"=>[{"1"=>[["1", "F"], ["2", "O"]]}, {"2"=>[["1", "N "]]}],
  #    "Pa"=>[{"1"=>[["1", "S"], ["3", " D"], ["5", "H"]]}, {"4"=>[["7", "K"]]}]}

The steps are as follows.

a = str.lines
  #=> ["Us,1,1,F\n", "Us,1,2,O\n", "Us,2,1,N \n",
  #    "Pa,1,1,S\n", "Pa,1,3, D\n", "Pa,1,5,H\n", "Pa,4,7,K\n"]
b = a.map { |line| line.chomp.split(',') }
  #=> [["Us", "1", "1", "F"], ["Us", "1", "2", "O"], ["Us", "2", "1", "N "],
  #    ["Pa", "1", "1", "S"], ["Pa", "1", "3", " D"], ["Pa", "1", "5", "H"],
  #    ["Pa", "4", "7", "K"]]
c = b.group_by(&:shift)
  #=> {"Us"=>[["1", "1", "F"], ["1", "2", "O"], ["2", "1", "N "]],
  #    "Pa"=>[["1", "1", "S"], ["1", "3", " D"], ["1", "5", "H"],
  #           ["4", "7", "K"]]}
c.transform_values { |arr| arr.group_by(&:shift).map { |k,v| { k=>v } } }
  #=> <the return value shown above>

When executing the last expression the first value passed to the block and assigned to the block variable is:

arr = [["1", "1", "F"], ["1", "2", "O"], ["2", "1", "N "]]

The block calculation then returns:

d = arr.group_by(&:shift)
  #=> {"1"=>[["1", "F"], ["2", "O"]], "2"=>[["1", "N "]]}
d.map { |k,v| { k=>v } }
  #=> [{"1"=>[["1", "F"], ["2", "O"]]}, {"2"=>[["1", "N "]]}]
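
The return values above keep the stray spaces of the input ("N " and " D"), whereas the hash asked for in the question has "N" and "D". A minimal sketch of one way to get the latter, assuming it is acceptable to strip every field (this replaces chomp, since strip also removes the newline):

```ruby
str = "Us,1,1,F\nUs,1,2,O\nUs,2,1,N \nPa,1,1,S\nPa,1,3, D\nPa,1,5,H\nPa,4,7,K\n"

result = str.lines.
  map { |line| line.split(',').map(&:strip) }. # strip each field, newline included
  group_by(&:shift).                           # group on column 1, removing it from each row
  transform_values { |arr| arr.group_by(&:shift).map { |k, v| { k => v } } }

p result
```

Note that group_by(&:shift) mutates the row arrays as it goes, which is harmless here because the rows are temporary.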

Use Hash#update

This uses the form of Hash#update (aka Hash#merge! ) that employs a block to determine the values of keys that are present in both hashes being merged. This form of update is used in two nesting levels.
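
As a small illustration of that form of update on its own (the hash totals and the block variable names are made up for this sketch): the block is called only for keys present in both hashes, and its return value becomes the value for that key.

```ruby
totals = { "a" => [1], "b" => [2] }

# "a" is in both hashes, so the block decides its value;
# "c" is new, so it is simply added; "b" is left untouched.
totals.update("a" => [3], "c" => [4]) { |_key, old_val, new_val| old_val + new_val }

p totals
```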

str.lines.each_with_object({}) do |line, h|
  s0, s1, s2, s3 = line.chomp.split(',')
  h.update(s0=>{ s1=>[[s2, s3]] }) do |_k0,oh0,nh0|
    oh0.merge(nh0) { |_k1,oh1,nh1| oh1+nh1 }
  end
end.transform_values { |h| h.map { |k,v| { k=>v } } }
  #=> <the return value shown above>

Note the code preceding transform_values returns the following.

{"Us"=>{"1"=>[["1", "F"], ["2", "O"]], "2"=>[["1", "N "]]},
 "Pa"=>{"1"=>[["1", "S"], ["3", " D"], ["5", "H"]], "4"=>[["7", "K"]]}}

A variant of this method is the following.

str.lines.each_with_object({}) do |line, h|
  s1, s2, s3, s4 = line.chomp.split(',')
  h.update(s1=>{ s2=>{ s2=>[[s3, s4]] } }) do |_k0,oh0,nh0|
    oh0.merge(nh0) do |_k1,oh1,nh1|
      oh1.merge(nh1) { |_k2,oh2,nh2| oh2+nh2 }
    end
  end
end.transform_values(&:values)
  #=> <the return value shown above>

Note the code preceding transform_values returns the following.

h = {"Us"=>{"1"=>{"1"=>[["1", "F"], ["2", "O"]]}, "2"=>{"2"=>[["1", "N "]]}},
     "Pa"=>{"1"=>{"1"=>[["1", "S"], ["3", " D"], ["5", "H"]]}, "4"=>{"4"=>[["7", "K"]]}}}

transform_values(&:values) converts the values of "Us" and "Pa" (which are hashes) to arrays of the values of those hashes (which are also hashes), namely,

[{"1"=>[["1", "F"], ["2", "O"]]}, {"2"=>[["1", "N "]]}]

for the key "Us" and

[{"1"=>[["1", "S"], ["3", " D"], ["5", "H"]]}, {"4"=>[["7", "K"]]}]

for "Pa" . It is because we want the values of "Us" and "Pa" to be arrays of hashes that we need the somewhat odd expression

s1=>{ s2=>{ s2=>[[s3, s4]] } }

Had we wanted the value of each of "Us" and "Pa" to be a single hash, we could have written

s1=>{ s2=>[[s3, s4]] }
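
A minimal sketch of that single-hash shape, which allows a chapter to be looked up directly by key. The name nested and the descriptive block variable names are assumptions for illustration, and the fields are stripped here (unlike the chomp-only code above) so that the values carry no stray spaces.

```ruby
str = "Us,1,1,F\nUs,1,2,O\nUs,2,1,N \nPa,1,1,S\nPa,1,3, D\nPa,1,5,H\nPa,4,7,K\n"

nested = str.lines.each_with_object({}) do |line, h|
  s1, s2, s3, s4 = line.split(',').map(&:strip)
  # Outer block: same section seen again, so merge the chapter hashes.
  h.update(s1 => { s2 => [[s3, s4]] }) do |_section, old_chapters, new_chapters|
    # Inner block: same chapter seen again, so append the new row.
    old_chapters.merge(new_chapters) { |_chapter, old_rows, new_rows| old_rows + new_rows }
  end
end

p nested["Pa"]["1"]  # the rows of chapter "1" in section "Pa"
```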
