How to Parse with Commas in CSV file in Ruby
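I'm parsing a CSV file with Ruby, but I'm running into trouble because the delimiter is a comma and my data also contains commas.
Where a field contains commas, it is wrapped in double quotes (" "), but I'm not sure how to make CSV ignore the commas inside the quotes.
CSV data sample (File.csv):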
NCB 14591 BLK 13 LOT W IRR," 84.07 FT OF 25, ALL OF 26,",TWENTY-THREE SAC HOLDING COR
Sample code:
require 'csv'
CSV.foreach("File.csv", encoding:'iso-8859-1:utf-8', :quote_char => "\x00").each do |x|
puts x[1]
end
Current output: " 84.07 FT OF 25
Expected output: 84.07 FT OF 25, ALL OF 26,
Link to a gist with the sample file and code: https://gist.github.com/markscoin/0d6c2d346d70fd627203317c5fe3097c
Try using the force_quotes option:
require 'csv'
CSV.foreach("data.csv", encoding:'iso-8859-1:utf-8', quote_char: '"', force_quotes: true).each do |x|
puts x[1]
end
Result:
84.07 FT OF 25, ALL OF 26,
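As far as I can tell, what actually makes the difference here is `quote_char: '"'` (which is also the default), since `force_quotes` is documented as an option for writing CSV. The question's `quote_char: "\x00"` is what stopped the quotes from being honored. A minimal sketch comparing the two settings, with the sample row inlined so nothing is read from disk (the variable names are just for illustration):

```ruby
require 'csv'

# Sample row copied from the question, inlined so the snippet runs without File.csv.
row = 'NCB 14591 BLK 13 LOT W IRR," 84.07 FT OF 25, ALL OF 26,",TWENTY-THREE SAC HOLDING COR'

# Default quote character ('"'): commas inside the quoted field are preserved.
default_fields = CSV.parse_line(row)

# quote_char "\x00" as in the question: '"' becomes an ordinary character,
# so the row is split at every comma.
unquoted_fields = CSV.parse_line(row, quote_char: "\x00")

puts default_fields[1]
puts unquoted_fields[1]
```

With the default setting the row should come back as three fields. With `"\x00"` the second field is cut off at the first comma, which matches the output reported in the question.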
The "Illegal quoting" error occurs when a line contains quotes that do not wrap the entire column, for example if your CSV looks like this:
NCB 14591 BLK 13 LOT W IRR," 84.07 FT OF 25, ALL OF 26,",TWENTY-THREE SAC HOLDING COR
NCB 14592 BLK 14 LOT W IRR,84.07 FT OF "25",TWENTY-FOUR SAC HOLDING COR
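To see the error in isolation, here is a small sketch that feeds only the second row to the default parser. The row is inlined, and `bad_row` and `error_raised` are names made up for this example:

```ruby
require 'csv'

# Second sample row: the quotes around 25 sit in the middle of an unquoted field.
bad_row = 'NCB 14592 BLK 14 LOT W IRR,84.07 FT OF "25",TWENTY-FOUR SAC HOLDING COR'

begin
  CSV.parse_line(bad_row)
  error_raised = false
rescue CSV::MalformedCSVError => e
  # The strict parser rejects quotes that do not wrap the whole field.
  error_raised = true
  puts "#{e.class}: #{e.message}"
end
```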
You can parse each line individually and change the quote character only for the lines that misuse quotes:
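require 'csv'

# Read the file line by line and print each parsed row.
def parse_file(file_name)
  File.foreach(file_name) do |line|
    parse_line(line) do |x|
      puts x.inspect
    end
  end
end

# Parse one line with the default quote character; if that fails,
# retry with quoting effectively disabled.
def parse_line(line)
  options = { encoding: 'iso-8859-1:utf-8' }
  begin
    # ** is needed on Ruby 3+, where a Hash is no longer converted to keyword arguments
    yield CSV.parse_line(line, **options)
  rescue CSV::MalformedCSVError
    # this line is misusing quotes, change the quote character and try again
    options.merge! quote_char: "\x00"
    retry
  end
end

parse_file('./File.csv')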
Running it gives you:
["NCB 14591 BLK 13 LOT W IRR", " 84.07 FT OF 25, ALL OF 26,", "TWENTY-THREE SAC HOLDING COR"]
["NCB 14592 BLK 14 LOT W IRR", "84.07 FT OF \"25\"", "TWENTY-FOUR SAC HOLDING COR"]
But if a single line mixes badly quoted and properly quoted fields, this breaks again. Ideally, you would just clean up the CSV so that it is valid.
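If cleaning the file is not an option, Ruby's CSV library also has a `liberal_parsing` option (available since Ruby 2.4) that tries to accept stray quotes inside unquoted fields. This is not part of the answer above, only a sketch of an alternative, with both sample rows inlined:

```ruby
require 'csv'

# Both sample rows from above, inlined for a self-contained example.
rows = [
  'NCB 14591 BLK 13 LOT W IRR," 84.07 FT OF 25, ALL OF 26,",TWENTY-THREE SAC HOLDING COR',
  'NCB 14592 BLK 14 LOT W IRR,84.07 FT OF "25",TWENTY-FOUR SAC HOLDING COR'
]

# liberal_parsing keeps properly quoted fields intact and tolerates
# quotes that appear in the middle of an unquoted field.
parsed = rows.map { |row| CSV.parse_line(row, liberal_parsing: true) }

parsed.each { |fields| puts fields.inspect }
```

Note that this is a best-effort mode: a stray quoted section that itself contains a comma is still split at that comma, so it may not rescue every malformed line.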