What's the easiest way to get the headers from a CSV file in Ruby?

All I need to do is get the headers from a CSV file.

file.csv is:

"A", "B", "C"  
"1", "2", "3"

My code is:

table = CSV.open("file.csv", :headers => true)

puts table.headers

table.each do |row|
  puts row 
end

Which gives me:

true
"1", "2", "3"

I've been looking at the Ruby CSV documentation for hours and this is driving me crazy. I am convinced that there must be a simple one-liner that can return the headers to me. Any ideas?

It looks like CSV.read will give you access to a headers method:

headers = CSV.read("file.csv", headers: true).headers
# => ["A", "B", "C"]

The above is really just a shortcut for CSV.open("file.csv", headers: true).read.headers. You could have gotten there using CSV.open as you tried, but CSV.open doesn't actually read the file when you call it, so it has no way of knowing what the headers are until it has read some data. This is why headers just returns true in your example. After reading some data, it finally returns the headers:

  table = CSV.open("file.csv", :headers => true)
  table.headers
  # => true
  table.read
  # => #<CSV::Table mode:col_or_row row_count:2>
  table.headers
  # => ["A", "B", "C"]
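Since the headers become available as soon as some data has been read, reading a single row should be enough. The following is a minimal sketch of that idea, not part of the original answer: the method name headers_after_first_row is hypothetical, and the sample file is written without spaces after the commas, which is an assumption about the real file.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: open the file lazily, parse one row so the
# header line gets consumed, then ask the CSV object for its headers.
def headers_after_first_row(path)
  CSV.open(path, headers: true) do |csv|
    csv.shift    # reads the header line plus the first data row
    csv.headers  # now an Array instead of true
  end
end

# Demo with a throwaway file (assumed layout: no spaces after commas).
Tempfile.create(['sample', '.csv']) do |f|
  f.write("A,B,C\n1,2,3\n")
  f.flush
  p headers_after_first_row(f.path)
  # => ["A", "B", "C"]
end
```

The block form of CSV.open also closes the file when the block exits, which the snippet above it does not do.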

In my opinion the best way to do this is:

headers = CSV.foreach('file.csv').first

Please note that it's very tempting to use CSV.read('file.csv', headers: true).headers, but the catch is that CSV.read loads the complete file into memory. That increases your memory footprint and also makes it very slow for bigger files. Whenever possible, please use CSV.foreach. Below are the benchmarks for just a 20 MB file:

Ruby version: ruby 2.4.1p111 
File size: 20M  
****************
Time and memory usage with CSV.foreach:
Time: 0.0 seconds
Memory: 0.04 MB
****************
Time and memory usage with CSV.read:
Time: 5.88 seconds
Memory: 314.25 MB

A 20 MB file increased the memory footprint by 314 MB with CSV.read; imagine what a 1 GB file would do to your system. In short, please do not use CSV.read. I did, and the system went down on a 300 MB file.
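To illustrate the streaming approach, here is a rough sketch that picks up the headers and processes the rows in a single pass with CSV.foreach, so only one row is held at a time. The method name stream_rows and the sample data are assumptions made for this example.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: walk the file row by row, remembering the
# headers from the first row and counting the data rows.
def stream_rows(path)
  headers = nil
  count = 0
  CSV.foreach(path, headers: true) do |row|
    headers ||= row.headers  # each CSV::Row knows its headers
    count += 1               # replace with real per-row work
  end
  [headers, count]
end

# Demo with a throwaway file containing two data rows.
Tempfile.create(['sample', '.csv']) do |f|
  f.write("A,B,C\n1,2,3\n4,5,6\n")
  f.flush
  p stream_rows(f.path)
  # => [["A", "B", "C"], 2]
end
```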

For further reading: if you want to read more about this, here is a very good article on handling big files.

Below is the script I used for benchmarking CSV.foreach and CSV.read:

require 'benchmark'
require 'csv'
def print_memory_usage
  memory_before = `ps -o rss= -p #{Process.pid}`.to_i
  yield
  memory_after = `ps -o rss= -p #{Process.pid}`.to_i
  puts "Memory: #{((memory_after - memory_before) / 1024.0).round(2)} MB"
end

def print_time_spent
  time = Benchmark.realtime do
    yield
  end
  puts "Time: #{time.round(2)} seconds"
end

file_path = '{path_to_csv_file}'
puts 'Ruby version: ' + `ruby -v`
puts 'File size:' + `du -h #{file_path}`
puts 'Time and memory usage with CSV.foreach: '
print_memory_usage do
  print_time_spent do
    headers = CSV.foreach(file_path, headers: false).first
  end
end
puts 'Time and memory usage with CSV.read:'
print_memory_usage do
  print_time_spent do
    headers = CSV.read(file_path, headers: true).headers
  end
end

If you want a shorter answer, you can try:

headers = CSV.open("file.csv", &:readline)
# => ["A", "B", "C"]
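For readers unfamiliar with the &:readline shorthand, the one-liner can be spelled out roughly as below. This is only a sketch of the equivalent long form: the method name first_row is hypothetical, and the sample file again assumes no spaces after the commas.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: the block form of CSV.open yields the CSV
# object, returns the block's value, and closes the file afterwards.
def first_row(path)
  CSV.open(path) do |csv|
    csv.readline  # parses only the first row (no headers: option here)
  end
end

# Demo with a throwaway file.
Tempfile.create(['sample', '.csv']) do |f|
  f.write("A,B,C\n1,2,3\n")
  f.flush
  p first_row(f.path)
  # => ["A", "B", "C"]
end
```

Because no headers: option is passed, the first line comes back as a plain Array of strings rather than being treated as a header row.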
