How to group_by this array of hashes

I have read data in CSV format from a file into the following array:

arr = [
["company", "location", "region", "service", "price", "duration", "disabled"],
["Google", "Berlin", "EU", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["IBM", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "true"],
["Google<script>alert('hi')<script>", "Berlin", "EU", "Practical TDD", "300", "60", "false"],
["Œoogle", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["Apple", "Berlin", "EU", "Practical TDD", "300", "60", "true"],
["Apple", "London", "EU", "Advanced AngularJS", "1200", "180", "false"],
["Apple", "New York", "US", "Restful API design", "1500", "120", "false"]
]

I want to import this data into the database, based on the following associations:

# company.rb
  has_many :regions
  has_many :services

# region.rb
  has_many :branches
  belongs_to :company

# branch.rb
  belongs_to :region
  has_many :services

# service.rb
  belongs_to :company
  belongs_to :branch

Maybe a hash like the one below could be used (I'm not sure; please suggest a better design if possible):

{"Google" => [
  :name => "Google",
  :regions_attributes => {
    :name => "US", 
    :locations_attributes => {
      :name => "San Francisco"
    }
  },
  :services_attributes => [{
    :name => "Restful API design",
    ...
  },
  {
    :name => "Design with HTML/CSS",
    ...
  }]
]}

My attempt:

companies = []
CSV.foreach(csv_file, headers: true) do |row|
  company = {}
  company[:name]   = row['company']
  company[:regions_attributes] = {}
  company[:regions_attributes][:name] = row['region']
  company[:regions_attributes][:branches_attributes] = {}
  company[:regions_attributes][:branches_attributes][:name] = row['location']
  company[:services_attributes] = {}
  company[:services_attributes][:name] = row['service']
  company[:services_attributes][:price] = row['price']
  company[:services_attributes][:duration] = row['duration']
  company[:services_attributes][:disabled] = row['disabled']
  companies << company
end

companies.uniq! { |c| c.values }
companies = companies.group_by { |c| c[:name] }

It groups by company name.

I want to group the services that belong to one region, as in the example above, where San Francisco, US has two services under it.

Update

Based on Cary Swoveland's solution, I was able to adapt the code to my requirements, but the associations are not working as I expected.

companies = CSV.read(csv_file, headers: true).group_by {|csv| csv["company"]}
final = []
companies.transform_values do |arr1|
  company = Company.new(name: arr1.pluck("company").first.encode(Encoding.find('ASCII'), encoding_options))
  services = arr1.map do |c|
    { name: c['service'], price: c['price'], duration: c['duration'], disabled: c['disabled'] }
  end.uniq
  company.services.build(services)
  regions = arr1.group_by { |csv| csv["region"] }.transform_values do |arr2|
    branches = []
    branches << arr2.pluck('location').uniq.map { |location| { name: location, services_attributes: services } }
    { name: arr2.pluck('region').uniq.first, branches_attributes: branches.flatten }
  end
  company.regions.build(regions.values)
  final << company
end

Company.import(final, recursive: true) #activerecord-import gem

Consider changing the structure of your hash and constructing it with the code below. The file 'tmp.csv' contains the first 20 or so rows of the CSV file whose link is given by the OP. I've included its contents at the end.

require 'csv'

CSV.read('tmp.csv', headers: true).group_by { |csv| csv["company"] }.
    transform_values do |arr1|
      arr1.group_by { |csv| csv["region"] }.
           transform_values do |arr2|
             arr2.group_by { |csv| csv["location"] }.
                  transform_values do |arr3|
                    arr3.map { |csv| csv["service"] }.uniq
                  end
           end
    end

  #=> {"Google"=>{
         "EU"=>{
           "Berlin"=>["Design with HTML/CSS","Advanced AngularJS","Restful API design"],
           "London"=>["Restful API design"]
         },
         "US"=>{
            "San Francisco"=>["Design with HTML/CSS", "Restful API design"]
         }
       },
       "Apple"=>{
         "EU"=>{
           "London"=>["Design with HTML/CSS"],
           "Berlin"=>["Restful API design"]
         },
         "US"=>{
           "San Francisco"=>["Design with HTML/CSS"]
         }
       },
       "IBM"=>{
         "US"=>{
           "San Francisco"=>["Design with HTML/CSS"]
         },
         "EU"=>{
           "Berlin"=>["Restful API design"],
           "London"=>["Restful API design"]
         }
      }
     }

If this hash format is not suitable (but the content is what is needed), it could easily be changed to a different format.
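As one illustration of such a reformatting, here is a minimal sketch that reshapes the nested hash into an array of hashes using the `regions_attributes` / `branches_attributes` / `services_attributes` key names from the question. The helper name `to_nested_attributes` and the sample input are assumptions made for this example, and whether the models accept these keys depends on how nested attributes are configured, which the question does not show.

```ruby
# Hypothetical helper: reshape company => region => location => [services]
# into an array of hashes keyed like the question's nested attributes.
def to_nested_attributes(grouped)
  grouped.map do |company, regions|
    regions_attrs = regions.map do |region, locations|
      branches_attrs = locations.map do |location, services|
        services_attrs = services.map { |service| { name: service } }
        { name: location, services_attributes: services_attrs }
      end
      { name: region, branches_attributes: branches_attrs }
    end
    { name: company, regions_attributes: regions_attrs }
  end
end

# Sample input: a small slice of the hash shown above.
grouped = {
  "Google" => {
    "US" => { "San Francisco" => ["Design with HTML/CSS", "Restful API design"] }
  }
}

result = to_nested_attributes(grouped)
```

Here `result` is an array with one hash per company; each level carries a `:name` plus the collection nested beneath it.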

See the docs for CSV::read, CSV::Row#[], Enumerable#group_by and Hash#transform_values.
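To see how the last two methods combine on a single level, here is a small sketch using plain hashes in place of CSV rows. The sample rows are made up for illustration; the service names are borrowed from the question's data.

```ruby
# Plain hashes standing in for CSV::Row objects (illustrative data only).
rows = [
  { "region" => "EU", "service" => "Practical TDD" },
  { "region" => "US", "service" => "Restful API design" },
  { "region" => "EU", "service" => "Advanced AngularJS" }
]

# group_by collects the rows under each distinct region ...
by_region = rows.group_by { |row| row["region"] }

# ... and transform_values replaces each group with just its service names.
services_by_region = by_region.transform_values do |group|
  group.map { |row| row["service"] }
end
```

Nesting another `group_by` inside the `transform_values` block is what produces the deeper levels in the answer's code.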

I had to do some pre-processing on the linked CSV file. The problem is that the company names were preceded by a UTF-8 "Byte Order Mark" (search for "Ok, figured it out" here). I used the code given here by Nathan Long to remove those characters. The OP will have to write the CSV files without those marks or strip them off when reading the files.
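One possible way to strip the mark when reading is sketched below. This is not the code by Nathan Long referred to above, just a minimal alternative that assumes the text is already in memory as a UTF-8 string; the sample data is made up for illustration.

```ruby
require 'csv'

# Sample CSV text with a UTF-8 byte order mark (U+FEFF) before the header.
raw = "\uFEFFcompany,location,region\nGoogle,Berlin,EU\n"

# Drop the BOM if present; a string without one is returned unchanged.
clean = raw.delete_prefix("\uFEFF")

table = CSV.parse(clean, headers: true)
first_company = table.first["company"]
```

Without the `delete_prefix` step the first header would be `"\uFEFFcompany"`, so a lookup by `"company"` would not find the column. When reading from disk, opening the file with the `"r:bom|utf-8"` mode is another option worth checking against your Ruby version.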

The content of my reduced CSV test file is the following.

arr = ["company,location,region,service,price,duration,disabled\n",
       "Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
       "Google,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
       "Google,San Francisco,US,Restful API design,1500,120,FALSE\n",
       "Apple,London,EU,Design with HTML/CSS,120,30,FALSE\n",
       "Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
       "Apple,Berlin,EU,Restful API design,1500,120,FALSE\n",
       "IBM,San Francisco,US,Design with HTML/CSS,120,30,TRUE\n",
       "Google,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
       "IBM,Berlin,EU,Restful API design,1500,120,TRUE\n",
       "IBM,London,EU,Restful API design,1500,120,TRUE\n",
       "IBM,Berlin,EU,Restful API design,1500,120,TRUE\n",
       "IBM,London,EU,Restful API design,1500,120,TRUE\n",
       "IBM,San Francisco,US,Design with HTML/CSS,120,30,TRUE\n",
       "Google,Berlin,EU,Advanced AngularJS,1200,180,FALSE\n",
       "Google,Berlin,EU,Restful API design,1500,120,FALSE\n", 
       "Google,London,EU,Restful API design,1500,120,FALSE\n",
       "Apple,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
       "Google,San Francisco,US,Restful API design,1500,120,FALSE\n",
       "IBM,Berlin,EU,Restful API design,1500,120,TRUE\n"]
