Reading a file in chunks in Ruby

Question

I need to read a file in 1 MB chunks. Is there a cleaner way to do this in Ruby?

FILENAME="d:\\tmp\\file.bin"
MEGABYTE = 1024*1024
size = File.size(FILENAME)
open(FILENAME, "rb") do |io| 
  read = 0
  while read < size
    left = (size - read)
    cur = left < MEGABYTE ? left : MEGABYTE
    data = io.read(cur)
    read += data.size
    puts "READ #{cur} bytes" #yield data
  end
end

Answer 1

Adapted from the Ruby Cookbook, page 204:

FILENAME = "d:\\tmp\\file.bin"
MEGABYTE = 1024 * 1024

class File
  def each_chunk(chunk_size = MEGABYTE)
    yield read(chunk_size) until eof?
  end
end

open(FILENAME, "rb") do |f|
  f.each_chunk { |chunk| puts chunk }
end

Disclaimer: I'm a Ruby newbie and haven't tested this.
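
If reopening the core File class feels too invasive, the same loop can be sketched as a standalone method. The name each_chunk_of and its parameters are made up for illustration; the only behavior it relies on is that IO#read(length) returns nil at end of file.

```ruby
# Hypothetical helper (not part of Ruby's standard library): yields successive
# chunks of at most chunk_size bytes from the file at path.
def each_chunk_of(path, chunk_size = 1024 * 1024)
  # Without a block, return an Enumerator so callers can chain map, take, etc.
  return enum_for(:each_chunk_of, path, chunk_size) unless block_given?

  File.open(path, "rb") do |io|
    # IO#read(length) returns nil at end of file, which ends the loop.
    while (chunk = io.read(chunk_size))
      yield chunk
    end
  end
end
```

It would be called as each_chunk_of(FILENAME) { |chunk| ... }. Because the loop tests the return value of read rather than eof?, an empty file simply yields nothing.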

Answer 2

Alternatively, if you don't want to monkeypatch File:

until my_file.eof?
  do_something_with( my_file.read( bytes ) )
end

For example, to stream an uploaded tempfile into a new file:

# tempfile is a File instance
File.open( new_file, 'wb' ) do |f|
  # Read in small 65k chunks to limit memory usage
  f.write(tempfile.read(2**16)) until tempfile.eof?
end
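
When the goal is only to copy one stream into another, as in the upload example, Ruby's built-in IO.copy_stream can handle the buffering itself. This is a minimal sketch rather than the answer's own method; the helper name save_upload and the temp file names are placeholders.

```ruby
require "tempfile"

# Hypothetical helper: copies everything readable from source_io into a new
# file at dest_path and returns the number of bytes copied.
def save_upload(source_io, dest_path)
  File.open(dest_path, "wb") do |dest|
    # IO.copy_stream moves the data through internal buffers, so the whole
    # source is never held in memory as one Ruby string.
    IO.copy_stream(source_io, dest)
  end
end

# Demo with a throwaway temp file standing in for an uploaded file.
Tempfile.create("upload") do |upload|
  upload.binmode
  upload.write("x" * 200_000)
  upload.rewind
  dest_path = "#{upload.path}.copy"
  copied = save_upload(upload, dest_path)
  puts "copied #{copied} bytes"
  File.delete(dest_path)
end
```

The explicit read/write loop is still the better fit when each chunk has to be inspected or transformed on the way through.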

Answer 3

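You can use IO#each(sep, limit) with sep set to nil, so that only the limit ends a chunk. (An empty string as sep selects paragraph mode instead, so chunks would also end at blank lines.) For example: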

chunk_size = 1024
File.open('/path/to/file.txt').each(nil, chunk_size) do |chunk|
  puts chunk
end
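
To see what IO#each(nil, limit) actually yields, here is a small sketch that records the size of every chunk. The helper name chunk_sizes is invented for this example; it uses the block form of File.open so the file gets closed, and "rb" so the limit is counted in raw bytes.

```ruby
# Hypothetical helper: collects the byte size of each chunk yielded by
# IO#each(nil, limit) for the file at path.
def chunk_sizes(path, chunk_size)
  sizes = []
  # The block form of File.open closes the file when the block returns.
  File.open(path, "rb") do |f|
    # A nil separator disables line splitting, so newlines do not end a chunk.
    f.each(nil, chunk_size) { |chunk| sizes << chunk.bytesize }
  end
  sizes
end
```

For an 18-byte file and a chunk size of 5, this should report three full chunks followed by a 3-byte remainder, regardless of where the newlines fall.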

Answer 4

If you look at the Ruby docs at http://ruby-doc.org/core-2.2.2/IO.html, there is a line like this:

IO.foreach("testfile") {|x| print "GOT ", x }

The only caveat: since this process can read the temp file faster than the stream that is producing it, IMO a delay should be introduced.

IO.foreach("/tmp/streamfile") {|line|
  ParseLine.parse(line)
  sleep 0.3 # pause, as this process will discontinue if it doesn't allow some buffering
}

Answer 5

https://ruby-doc.org/core-3.0.2/IO.html#method-i-read gives an example of iterating over fixed-length records using read(length):

# iterate over fixed length records
open("fixed-record-file") do |f|
  while record = f.read(256)
    # ...
  end
end

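If length is a positive integer, read tries to read length bytes without any conversion (binary mode). It returns nil if EOF is encountered before anything can be read. Fewer than length bytes are returned if EOF is encountered during the read. With an integer length, the resulting string is always in ASCII-8BIT encoding.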
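
The return values described above can be checked with a short sketch. The helper name read_fixed_records is hypothetical; it gathers every record and then performs one more read at end of file so the nil result is visible to the caller.

```ruby
# Hypothetical helper: reads the file at path in record_size-byte steps and
# returns [records, value_of_one_more_read_at_eof].
def read_fixed_records(path, record_size)
  File.open(path, "rb") do |f|
    records = []
    # The final record may be shorter than record_size if the file length
    # is not an exact multiple of it.
    while (record = f.read(record_size))
      records << record
    end
    # The loop ended because read returned nil; reading again at EOF does the same.
    [records, f.read(record_size)]
  end
end
```

With a 10-byte file and record_size 4, records should hold strings of 4, 4 and 2 bytes, each tagged with the binary (ASCII-8BIT) encoding, and the second element should be nil.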

Answer 6

FILENAME="d:/tmp/file.bin"

class File
  MEGABYTE = 1024*1024

  def each_chunk(chunk_size=MEGABYTE)
    yield self.read(chunk_size) until self.eof?
  end
end

open(FILENAME, "rb") do |f|
  f.each_chunk {|chunk| puts chunk }
end

It works, mbarkhau. I just moved the constant definition into the File class and added a couple of "self"s for clarity.

6 answers

Answer 1
20, accepted, 2009-11-05 17:57:33

Answer 2
10, 2012-03-06 16:48:24

Answer 3
2, 2016-12-03 07:15:11

Answer 4
0, 2015-12-30 12:31:54

Answer 5
0, 2021-08-23 23:29:20

Answer 6
-1, 2009-11-05 19:38:08
