
How to create a Hash from a nested CSV in Ruby?

I have a CSV in the following format:

name,contacts.0.phone_no,contacts.1.phone_no,codes.0,codes.1
YK,1234,4567,AB001,AK002

As you can see, this is a nested structure. The CSV may contain multiple rows. I would like to convert this into an array of hashes like this:

[
  {
    name: 'YK',
    contacts: [
        {
            phone_no: '1234'
        },
        {
            phone_no: '4567'
        }
    ],
    codes: ['AB001', 'AK002']
  }
]

The structure uses numbers in the given format to represent arrays. There can be hashes inside arrays. Is there a simple way to do that in Ruby?

The CSV headers are dynamic and can change, so I have to build the hash on the fly from whatever headers the CSV file contains.

A Node.js library called csvtojson does this for JavaScript.

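Just read and parse the file line by line. After the loop, arr holds the array of hashes you need. Note that this hard-codes the column positions, so it only fits the exact header layout shown in the question.

arr = []

# 'data.csv' is a placeholder path; drop(1) skips the header row.
File.readlines('data.csv').drop(1).each do |line|
  fields = line.split(',').map(&:strip)

  hash = {
    name: fields[0],
    contacts: [{ phone_no: fields[1] }, { phone_no: fields[2] }],
    codes: [fields[3], fields[4]]
  }
  arr.push(hash)
end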
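
One caveat with splitting on commas by hand: a quoted field that itself contains a comma gets cut in two. The following is a minimal sketch, using a made-up sample line that is not from the question, comparing String#split with CSV.parse_line from Ruby's csv library.

```ruby
require 'csv'

# Hypothetical sample line: the second field is quoted and contains a comma.
line = 'YK,"12,34",4567,AB001,AK002'

# Naive splitting treats the comma inside the quotes as a separator.
naive = line.split(',')

# CSV.parse_line honours the quoting rules and returns one array of fields.
parsed = CSV.parse_line(line)

puts naive.size   # 6 pieces, the quoted field was cut in two
puts parsed.size  # 5 fields
```

If the input is known never to contain quoted fields, plain split is enough.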

Let's first construct a CSV file.

str = <<~END
name,contacts.0.phone_no,contacts.1.phone_no,codes.0,IQ,codes.1
YK,1234,4567,AB001,173,AK002
ER,4321,7654,BA001,81,KA002
END

FName = 't.csv'

File.write(FName, str)
  #=> 121

I have written a helper method that builds a pattern from the headers. The pattern is then used to convert each data row of the CSV file (every row after the header row) into an element (a hash) of the desired array.

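require 'csv'

# Builds a template hash from the headers of a CSV::Row.
def construct_pattern(csv)
  # Group the headers by their leading segment ("name", "contacts", "codes", ...).
  csv.headers.group_by { |col| col[/[^.]+/] }.
      transform_values do |arr|
        case arr.first.count('.')
        when 0  # plain column, e.g. "name"
          arr.first
        when 1  # array of values, e.g. "codes.0"
          arr
        else    # array of hashes, e.g. "contacts.0.phone_no"
          key = arr.first[/(?<=\d\.).*/]
          arr.map { |v| { key=>v } }
        end
      end
end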
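
To make the three string operations inside construct_pattern easier to follow, here is a small sketch that applies each of them to the example headers on their own. The variable names top_level, dots and inner are only illustrative.

```ruby
headers = ['name', 'contacts.0.phone_no', 'contacts.1.phone_no',
           'codes.0', 'IQ', 'codes.1']

# Text before the first dot: this becomes the top-level key.
top_level = headers.map { |h| h[/[^.]+/] }

# Number of dots: 0 = plain value, 1 = array of values, 2 = array of hashes.
dots = headers.map { |h| h.count('.') }

# Text after "<digit>.": the inner hash key, or nil when the header has none.
inner = headers.map { |h| h[/(?<=\d\.).*/] }

p top_level
p dots
p inner
```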

In the code below, csv is a CSV::Row. For the example being considered:

construct_pattern(csv)
  #=> {"name"=>"name",
  #    "contacts"=>[{"phone_no"=>"contacts.0.phone_no"},
  #                 {"phone_no"=>"contacts.1.phone_no"}],
  #    "codes"=>["codes.0", "codes.1"],
  #    "IQ"=>"IQ"}

In the code below, the modifier if pattern.empty? is tacked onto the assignment pattern = construct_pattern(csv), which ensures the pattern is constructed only once.

We may now construct the desired array.

pattern = {}
CSV.foreach(FName, headers: true).map do |csv|
  pattern = construct_pattern(csv) if pattern.empty?
  pattern.each_with_object({}) do |(k,v),h|
    h[k] =
    case v
    when Array
      case v.first
      when Hash
        v.map { |g| g.transform_values { |s| csv[s] } }
      else
        v.map { |s| csv[s] }
      end
    else
      csv[v]
    end
  end
end
  #=> [{"name"=>"YK",
  #     "contacts"=>[{"phone_no"=>"1234"}, {"phone_no"=>"4567"}],
  #     "codes"=>["AB001", "AK002"],
  #     "IQ"=>"173"},
  #    {"name"=>"ER",
  #     "contacts"=>[{"phone_no"=>"4321"}, {"phone_no"=>"7654"}],
  #     "codes"=>["BA001", "KA002"],
  #     "IQ"=>"81"}] 
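
The result above has string keys, whereas the question shows symbol keys, and as far as I can tell the pattern approach assumes each hash inside an array has a single key. A minimal sketch of a more general variant follows. It is an alternative, not the method used above: it walks each dotted header segment by segment, treating numeric segments as array indices and all others as symbol keys. The helper names deep_set and unflatten are hypothetical.

```ruby
require 'csv'

# Hypothetical helper: stores value in container at the path given by segments.
def deep_set(container, segments, value)
  head, *rest = segments
  # A numeric segment is an array index; anything else is a symbol key.
  key = head.match?(/\A\d+\z/) ? head.to_i : head.to_sym
  if rest.empty?
    container[key] = value
  else
    # Create an Array when the next segment is numeric, otherwise a Hash.
    container[key] ||= rest.first.match?(/\A\d+\z/) ? [] : {}
    deep_set(container[key], rest, value)
  end
end

# Hypothetical helper: turns one CSV::Row with dotted headers into a nested hash.
def unflatten(row)
  row.to_h.each_with_object({}) do |(header, value), result|
    deep_set(result, header.split('.'), value)
  end
end

data = <<~ROWS
  name,contacts.0.phone_no,contacts.1.phone_no,codes.0,IQ,codes.1
  YK,1234,4567,AB001,173,AK002
ROWS

records = CSV.parse(data, headers: true).map { |row| unflatten(row) }
p records
```

This sketch assumes the array indices in the headers start at 0 and have no gaps. A missing index would leave a nil entry in the resulting array.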

The CSV methods I've used are documented in CSV. See also Enumerable#group_by and Hash#transform_values.
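
For readers who have not met these two methods, a short sketch of what each does on a reduced, illustrative header list:

```ruby
headers = ['name', 'codes.0', 'IQ', 'codes.1']

# group_by collects elements under the value returned by the block,
# keeping keys in order of first appearance.
grouped = headers.group_by { |h| h[/[^.]+/] }

# transform_values returns a new hash with the same keys and mapped values.
sizes = grouped.transform_values(&:size)

p grouped
p sizes
```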
