
Efficient way to combine multiple text files

I have multiple text files that I need to read and combine into one file. The files vary in size from 1 to 50 MB each. What's the most efficient way to combine these files without running into the dreaded System.OutOfMemoryException?

Do it in chunks:

const int chunkSize = 2 * 1024; // 2KB
var inputFiles = new[] { "file1.dat", "file2.dat", "file3.dat" };
using (var output = File.Create("output.dat"))
{
    foreach (var file in inputFiles)
    {
        using (var input = File.OpenRead(file))
        {
            var buffer = new byte[chunkSize];
            int bytesRead;
            while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, bytesRead);
            }
        }
    }
}

Darin is on the right track. My tweak would be:

using (var output = File.Create("output"))
{
    foreach (var file in new[] { "file1", "file2" })
    {
        using (var input = File.OpenRead(file))
        {
            input.CopyTo(output);
        }
    }
}

The code above requires .NET 4.0 (for Stream.CopyTo). The following version does the same thing for text files but is also compatible with .NET 2.0:

using (var output = new StreamWriter("D:\\TMP\\output"))
{
    foreach (var file in Directory.GetFiles("D:\\TMP", "*.*"))
    {
        using (var input = new StreamReader(file))
        {
            output.WriteLine(input.ReadToEnd());
        }
    }
}

Please note that this reads each entire file into memory at once. This means that large files will cause a lot of memory to be used (and if not enough memory is available, it may fail altogether).
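If that memory usage is a concern, a line-by-line variant keeps memory bounded to a single line at a time while remaining .NET 2.0 compatible. This is only a sketch reusing the same assumed D:\TMP paths as above, not the original answer's code:

```csharp
using System.IO;

using (var output = new StreamWriter("D:\\TMP\\output"))
{
    foreach (var file in Directory.GetFiles("D:\\TMP", "*.*"))
    {
        using (var input = new StreamReader(file))
        {
            // Read one line at a time instead of ReadToEnd(),
            // so only the current line is held in memory.
            string line;
            while ((line = input.ReadLine()) != null)
            {
                output.WriteLine(line);
            }
        }
    }
}
```

Like the ReadToEnd version, this writes a newline after each line, so files that do not end with a line break are still separated in the output.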
