How to group_by this array of hashes
I have read CSV data from a file into the following array:
arr = [
["company", "location", "region", "service", "price", "duration", "disabled"],
["Google", "Berlin", "EU", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "false"],
["Google", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["IBM", "San Francisco", "US", "Design with HTML/CSS", "120", "30", "true"],
["Google<script>alert('hi')<script>", "Berlin", "EU", "Practical TDD", "300", "60", "false"],
["Œoogle", "San Francisco", "US", "Restful API design", "1500", "120", "false"],
["Apple", "Berlin", "EU", "Practical TDD", "300", "60", "true"],
["Apple", "London", "EU", "Advanced AngularJS", "1200", "180", "false"],
["Apple", "New York", "US", "Restful API design", "1500", "120", "false"]
]
I want to import it into the database, based on the associations below:
# company.rb
has_many :regions
has_many :services
# region.rb
has_many :branches
belongs_to :company
# branch.rb
belongs_to :region
has_many :services
# service.rb
belongs_to :company
belongs_to :branch
A hash like the one below could be used (I'm not sure; please suggest a good design):
{"Google" => {
  :name => "Google",
  :regions_attributes => {
    :name => "US",
    :locations_attributes => {
      :name => "San Francisco"
    }
  },
  :services_attributes => [{
    :name => "Restful API design",
    ...
  },
  {
    :name => "Design with HTML/CSS",
    ...
  }]
}}
My attempt:
companies = []
CSV.foreach(csv_file, headers: true) do |row|
company = {}
company[:name] = row['company']
company[:regions_attributes] = {}
company[:regions_attributes][:name] = row['region']
company[:regions_attributes][:branches_attributes] = {}
company[:regions_attributes][:branches_attributes][:name] = row['location']
company[:services_attributes] = {}
company[:services_attributes][:name] = row['service']
company[:services_attributes][:price] = row['price']
company[:services_attributes][:duration] = row['duration']
company[:services_attributes][:disabled] = row['disabled']
companies << company
end
companies.uniq! { |c| c.values }
companies = companies.group_by { |c| c[:name] }
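This groups by company name.
As in the example above, I want to group the services within a region; for instance, San Francisco in the US has two services.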
Update
Based on Cary Swoveland's solution, I was able to adapt the code to my requirements, but the associations do not work the way I expected.
companies = CSV.read(csv_file, headers: true).group_by {|csv| csv["company"]}
final = []
companies.transform_values do |arr1|
company = Company.new(name: arr1.pluck("company").first.encode(Encoding.find('ASCII'), encoding_options))
services = arr1.map do |c|
{ name: c['service'], price: c['price'], duration: c['duration'], disabled: c['disabled'] }
end.uniq
company.services.build(services)
regions = arr1.group_by { |csv| csv["region"] }.transform_values do |arr2|
branches = []
branches << arr2.pluck('location').uniq.map { |location| { name: location, services_attributes: services } }
{ name: arr2.pluck('region').uniq.first, branches_attributes: branches.flatten }
end
company.regions.build(regions.values)
final << company
end
Company.import(final, recursive: true) #activerecord-import gem
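Consider changing the structure of the hash and constructing it with the code below. The file 'tmp.csv' contains roughly the first 20 lines of the CSV file the OP linked to. I have included its contents at the end.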
require 'csv'
CSV.read('tmp.csv', headers: true).group_by { |csv| csv["company"] }.
  transform_values do |arr1|
    arr1.group_by { |csv| csv["region"] }.
      transform_values do |arr2|
        arr2.group_by { |csv| csv["location"] }.
          transform_values do |arr3|
            arr3.map { |csv| csv["service"] }.uniq
          end
      end
  end
#=> {"Google"=>{
"EU"=>{
"Berlin"=>["Design with HTML/CSS","Advanced AngularJS","Restful API design"],
"London"=>["Restful API design"]
},
"US"=>{
"San Francisco"=>["Design with HTML/CSS", "Restful API design"]
}
},
"Apple"=>{
"EU"=>{
"London"=>["Design with HTML/CSS"],
"Berlin"=>["Restful API design"]
},
"US"=>{
"San Francisco"=>["Design with HTML/CSS"]
}
},
"IBM"=>{
"US"=>{
"San Francisco"=>["Design with HTML/CSS"]
},
"EU"=>{
"Berlin"=>["Restful API design"],
"London"=>["Restful API design"]
}
}
}
If this hash format is not suitable (but it has the content you need), it can easily be changed to another format.
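As one possible illustration of such a conversion, the sketch below turns a hash of the shape company => region => location => services into a list of nested-attributes hashes. The method name `to_nested_attributes` is hypothetical, the `*_attributes` key names are assumptions borrowed from the question, and the sample data is made up; whether the result can be passed to the models depends on how `accepts_nested_attributes_for` is set up, which the question does not show.

```ruby
# Sample data in the nested shape: company => region => location => services.
nested = {
  "Google" => {
    "EU" => { "Berlin" => ["Design with HTML/CSS"], "London" => ["Restful API design"] },
    "US" => { "San Francisco" => ["Design with HTML/CSS", "Restful API design"] }
  }
}

# Hypothetical helper: walks the three levels and emits one hash per company.
# The *_attributes keys are assumed from the question, not from a known schema.
def to_nested_attributes(nested)
  nested.map do |company, regions|
    {
      name: company,
      regions_attributes: regions.map do |region, locations|
        {
          name: region,
          branches_attributes: locations.map do |location, services|
            { name: location, services_attributes: services.map { |s| { name: s } } }
          end
        }
      end
    }
  end
end

companies = to_nested_attributes(nested)
```

Only the service name is carried over here; price, duration and disabled would need to be kept in the innermost arrays first if they are required.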
See the documentation for CSV::read, CSV::Row#[], Enumerable#group_by and Hash#transform_values.
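To see how these methods fit together on a small scale, here is a minimal sketch that groups on a composite key instead of nesting three levels. It parses an in-memory string with `CSV.parse` rather than reading a file; the constant `CSV_TEXT` and the method name `services_by_branch` are made up for this example.

```ruby
require 'csv'

# A few sample rows, including one duplicate, standing in for the real file.
CSV_TEXT = <<~DATA
  company,location,region,service
  Google,Berlin,EU,Design with HTML/CSS
  Google,San Francisco,US,Design with HTML/CSS
  Google,San Francisco,US,Restful API design
  Google,San Francisco,US,Restful API design
DATA

# Hypothetical helper: returns { [company, region, location] => [service, ...] }.
def services_by_branch(csv_text)
  CSV.parse(csv_text, headers: true)
     .group_by { |row| [row["company"], row["region"], row["location"]] }
     .transform_values { |rows| rows.map { |row| row["service"] }.uniq }
end

grouped = services_by_branch(CSV_TEXT)
grouped[["Google", "US", "San Francisco"]]
#=> ["Design with HTML/CSS", "Restful API design"]
```

A flat composite key is easier to iterate when creating records one branch at a time, while the nested form mirrors the company, region and branch associations more directly.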
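I needed to do some preprocessing on the linked CSV file. The problem is that the company name is preceded by the "byte order mark" of a UTF-8 file (the page linked here describes it). I removed those characters with the code given by Nathan Long here. The OP will have to either write the CSV file without these marks or strip them when reading the file.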
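One way to strip the mark while reading, shown here as a sketch rather than the code referred to above, is to open the file with Ruby's `bom|utf-8` external encoding, which discards a leading UTF-8 byte order mark if one is present. The method name `read_csv_without_bom` is hypothetical.

```ruby
require 'csv'

# Hypothetical helper: reads the whole file as UTF-8, dropping a leading BOM,
# then parses it with the first line as headers.
def read_csv_without_bom(path)
  text = File.open(path, 'r:bom|utf-8', &:read)
  CSV.parse(text, headers: true)
end
```

With the mark removed, the first header is the plain string "company", so lookups such as csv["company"] behave as expected.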
The content of my reduced CSV test file is the following.
arr = ["company,location,region,service,price,duration,disabled\n",
"Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
"Google,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
"Google,San Francisco,US,Restful API design,1500,120,FALSE\n",
"Apple,London,EU,Design with HTML/CSS,120,30,FALSE\n",
"Google,Berlin,EU,Design with HTML/CSS,120,30,FALSE\n",
"Apple,Berlin,EU,Restful API design,1500,120,FALSE\n",
"IBM,San Francisco,US,Design with HTML/CSS,120,30,TRUE\n",
"Google,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
"IBM,Berlin,EU,Restful API design,1500,120,TRUE\n",
"IBM,London,EU,Restful API design,1500,120,TRUE\n",
"IBM,Berlin,EU,Restful API design,1500,120,TRUE\n",
"IBM,London,EU,Restful API design,1500,120,TRUE\n",
"IBM,San Francisco,US,Design with HTML/CSS,120,30,TRUE\n",
"Google,Berlin,EU,Advanced AngularJS,1200,180,FALSE\n",
"Google,Berlin,EU,Restful API design,1500,120,FALSE\n",
"Google,London,EU,Restful API design,1500,120,FALSE\n",
"Apple,San Francisco,US,Design with HTML/CSS,120,30,FALSE\n",
"Google,San Francisco,US,Restful API design,1500,120,FALSE\n",
"IBM,Berlin,EU,Restful API design,1500,120,TRUE\n"]