
Ruby CSV - get current line/row number

I'm trying to work out how to get the current line/row number from Ruby CSV. This is my code:

options = {:encoding => 'UTF-8', :skip_blanks => true}
CSV.foreach("data.csv", options, ) do |row, i|
   puts i
end

But this doesn't seem to work as expected. Is there a way to do this?

Because of changes to CSV in current Rubies, the approach has to change. The original solution, for Ruby prior to 2.6, is farther down in this answer, along with with_index, which continues to work regardless of the version.

For 2.6+ this will work:

require 'csv'

puts RUBY_VERSION

csv_file = CSV.open('test.csv')
csv_file.each do |csv_row|
  puts '%i %s' % [csv_file.lineno, csv_row]
end
csv_file.close

If I read:

Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!\nair, moon roof, loaded",4799.00

The code results in this output:

2.6.3
1 ["Year", "Make", "Model", "Description", "Price"]
2 ["1997", "Ford", "E350", "ac, abs, moon", "3000.00"]
3 ["1999", "Chevy", "Venture \"Extended Edition\"", "", "4900.00"]
4 ["1999", "Chevy", "Venture \"Extended Edition, Very Large\"", "", "5000.00"]
5 ["1996", "Jeep", "Grand Cherokee", "MUST SELL!\\nair, moon roof, loaded", "4799.00"]

The change is needed because we have to get access to the current file handle. Previously we could use the global $., which always had a possibility of failure because globals can get stomped on by other sections of called code. If we have the handle of the file being opened, then we can use lineno without that concern.
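As a minimal sketch of the same idea, the block form of CSV.open hands over the handle and closes the file when the block returns, so the explicit close is not needed. The temporary file and its contents are assumptions made up for illustration; only CSV.open, each and lineno come from the answer above.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical sample data, written to a temporary file so the sketch is self-contained.
sample = Tempfile.new(['sample', '.csv'])
sample.write("Year,Make\n1997,Ford\n1999,Chevy\n")
sample.close

numbered_rows = []

# The block form of CSV.open closes the file when the block returns.
CSV.open(sample.path) do |csv_file|
  csv_file.each do |csv_row|
    # lineno is read from the handle itself, not from a global.
    numbered_rows << [csv_file.lineno, csv_row]
  end
end

numbered_rows.each { |lineno, row| puts '%i %s' % [lineno, row] }
sample.unlink
```

Because lineno belongs to the CSV object, other code that reads files in the meantime cannot disturb the count.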


$.

Ruby prior to 2.6 would let us do this:

Ruby has a magic variable $. which is the line number of the current file being read:

require 'csv'

CSV.foreach('test.csv') do |csv|
  puts $.
end

With the code above, I get:

1
2
3
4
5

$INPUT_LINE_NUMBER

$. is used all the time in Perl. In Ruby, it's recommended we use it the following way to avoid the "magical" side of it:

require 'English'

puts $INPUT_LINE_NUMBER
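
To see the alias in action without involving CSV at all, here is a small sketch that reads a plain text file line by line. The temporary file and its three lines are invented for illustration; the point is only that $INPUT_LINE_NUMBER names the same value as $. once the English library is loaded.

```ruby
require 'English'
require 'tempfile'

# Hypothetical three-line file, created only for this sketch.
file = Tempfile.new('lines')
file.write("first\nsecond\nthird\n")
file.close

seen = []
File.open(file.path) do |io|
  io.each_line do |line|
    # $INPUT_LINE_NUMBER is the English alias for $.
    seen << [$INPUT_LINE_NUMBER, line.chomp]
  end
end
file.unlink

seen.each { |number, text| puts "#{number}: #{text}" }
```

This only shows the behaviour for an ordinary IO read; whether the same value is meaningful while CSV is doing the reading depends on the Ruby version, as discussed elsewhere in this thread.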

If it's necessary to deal with embedded line-ends in fields, it's easily handled by a minor modification. Assuming a CSV file "test.csv" which contains a line with an embedded new-line:

Year,Make,Model,Description,Price
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00

with_index

Using Enumerator's with_index(1) makes it easy to keep track of the number of times CSV yields to the block, effectively simulating $. while honoring CSV's work when it reads the extra lines necessary to deal with the line-ends:

require 'csv'

CSV.foreach('test.csv', headers: true).with_index(1) do |row, ln|
  puts '%-3d %-5s %-26s %s' % [ln, *row.values_at('Make', 'Model', 'Description')]
end

Which, when run, outputs:

$ ruby test.rb
1   Ford  E350                       ac, abs, moon
2   Chevy Venture "Extended Edition"
3   Jeep  Grand Cherokee             MUST SELL!
air, moon roof, loaded
4   Chevy Venture "Extended Edition, Very Large"
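
The difference between counting records and counting physical lines can be checked directly. This is a rough sketch with made-up data (the two-column file and the variable names are assumptions, not part of the original answer): it numbers the records with with_index(1) and separately counts the raw lines in the same file.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical data: the Jeep record spans two physical lines.
data = <<~CSV_DATA
  Make,Description
  Jeep,"MUST SELL!
  air, moon roof"
  Ford,"ac, abs"
CSV_DATA

file = Tempfile.new(['embedded', '.csv'])
file.write(data)
file.close

# with_index(1) counts the records CSV yields, not the lines it had to read.
numbered_makes = CSV.foreach(file.path, headers: true).with_index(1).map do |row, ln|
  [ln, row['Make']]
end

# Counting raw lines includes the header and both halves of the split record.
physical_lines = File.foreach(file.path).count
file.unlink

puts numbered_makes.inspect
puts physical_lines
```

Here the file has four physical lines but only two data records, so a counter based on lines read would drift away from the record number as soon as a field contains a line-end.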

Here's an alternative solution:

options = {:encoding => 'UTF-8', :skip_blanks => true}

# **options passes the hash as keyword arguments, which Ruby 3 requires here.
CSV.foreach("data.csv", **options).with_index do |row, i|
   puts i
end

Not a clean solution, but a simple one:

options = {:encoding => 'UTF-8', :skip_blanks => true}
i = 0
# **options passes the hash as keyword arguments, which Ruby 3 requires here.
CSV.foreach("data.csv", **options) do |row|
  puts i
  i += 1
end

Ruby 2.6+

Without Headers

CSV.foreach( "data.csv", encoding: "UTF-8" ).with_index do |row, row_number|
  puts row_number
end

With Headers

CSV.foreach( "data.csv", encoding: "UTF-8", headers: true ).with_index( 2 ) do |row, row_number|
  puts row_number # Starts at row 2, which is the first row after the header row.
end

In Ruby 2.6, $INPUT_LINE_NUMBER no longer gives you the current line number. What's worse is that it's returning values of 2 and 1. I'm not sure what that is supposed to represent, but it's certainly not the row number. Since it doesn't raise an exception, it can really bite you if you're not checking that value. I highly recommend you replace all occurrences of $INPUT_LINE_NUMBER in your code to avoid this gotcha.
