MemoryStream and Large Object Heap

I have to transfer large files between computers over unreliable connections using WCF.

Because I want to be able to resume a transfer and I don't want WCF to limit my file size, I am chunking the files into 1 MB pieces. These chunks are transported as streams, which has worked quite nicely so far.

My steps are:

  1. Open a FileStream.
  2. Read a chunk from the file into a byte[] and create a MemoryStream from it.
  3. Transfer the chunk.
  4. Go back to step 2 until the whole file is sent.

My problem is in step 2. I assume that when I create a MemoryStream from a byte array, it will end up on the LOH and ultimately cause an OutOfMemoryException. I could not actually reproduce this error, so maybe my assumption is wrong.

Now, I don't want to send the byte[] in the message, as WCF will tell me the array size is too big. I can change the maximum allowed array size and/or the size of my chunks, but I hope there is another solution.

My actual question(s):

  • Will my current solution create objects on the LOH, and will that cause me problems?
  • Is there a better way to solve this?

Btw.: On the receiving side I simply read smaller chunks from the arriving stream and write them directly into the file, so no large byte arrays are involved.
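For readers who want to see what the receiving side described above might look like, here is a minimal sketch. The class name ChunkReceiver, the helper AppendChunk, and the 64 KB buffer size are all assumptions for illustration, not the poster's actual code. The point is only that a copy buffer well below 85,000 bytes never lands on the LOH.

```csharp
using System;
using System.IO;

public static class ChunkReceiver
{
    // Assumed buffer size: anything comfortably below 85,000 bytes
    // keeps the copy buffer off the Large Object Heap.
    private const int BufferSize = 64 * 1024;

    // Hypothetical helper: copies everything readable from 'incoming'
    // into 'target' and returns the number of bytes copied.
    public static long AppendChunk(Stream incoming, Stream target)
    {
        byte[] buffer = new byte[BufferSize];
        long total = 0;
        int read;
        while ((read = incoming.Read(buffer, 0, buffer.Length)) > 0)
        {
            target.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void Main()
    {
        // Stand-ins for the demo only: one MemoryStream plays the arriving
        // WCF stream, another plays the destination file.
        byte[] payload = new byte[200000];
        new Random(1).NextBytes(payload);
        using (var incoming = new MemoryStream(payload))
        using (var target = new MemoryStream())
        {
            long copied = AppendChunk(incoming, target);
            Console.WriteLine(copied);
        }
    }
}
```

In real use the target would presumably be a FileStream opened for appending, positioned at the offset of the chunk being received.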

Edit:

Current solution:

for (int i = resumeChunk; i < chunks; i++)
{
    byte[] buffer = new byte[chunkSize];
    fileStream.Position = i * chunkSize;
    int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
    Array.Resize(ref buffer, actualLength);
    using (MemoryStream stream = new MemoryStream(buffer))
    {
        UploadFile(stream);
    }
}

I hope this is okay. It's my first answer on StackOverflow.

Yes, absolutely: if your chunk size is over 85,000 bytes, the array will be allocated on the large object heap. You will probably not run out of memory very quickly, because you are allocating and deallocating contiguous areas of memory that are all the same size, so when memory fills up the runtime can fit a new chunk into an old, reclaimed memory area.
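One way to check this on your own machine is a small probe like the one below. It is only a sketch: the class name LohProbe and its helper are made up for illustration, and it relies on the fact that GC.GetGeneration reports large-object-heap objects as belonging to the highest generation (generation 2 on the runtimes I am aware of).

```csharp
using System;

public static class LohProbe
{
    // Returns the GC generation the runtime reports for a freshly
    // allocated byte[] of the given length.
    public static int GenerationOfNewArray(int length)
    {
        byte[] data = new byte[length];
        return GC.GetGeneration(data);
    }

    public static void Main()
    {
        // A small array normally starts life in generation 0.
        Console.WriteLine(GenerationOfNewArray(1024));

        // A 1 MB chunk is far above the 85,000-byte threshold, so it is
        // allocated on the LOH and reported with the highest generation.
        Console.WriteLine(GenerationOfNewArray(1024 * 1024));
    }
}
```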

I would be a little worried about the Array.Resize call, as that will create another array (see http://msdn.microsoft.com/en-us/library/1ffy6686(VS.80).aspx ). This is an unnecessary step if actualLength == chunkSize, as it will be for all but the last chunk. So I would, as a minimum, suggest:

if (actualLength != chunkSize) Array.Resize(ref buffer, actualLength);

This should remove a lot of allocations. If actualLength is not the same as chunkSize but is still > 85,000, the new array will also be allocated on the large object heap, potentially causing it to fragment and possibly causing apparent memory leaks. I believe it would still take a long time to actually run out of memory, as the leak would be quite slow.

I think a better implementation would be to use some kind of buffer pool to provide the arrays. You could roll your own (it would be too complicated), but WCF does provide one for you. I have rewritten your code slightly to take advantage of that:

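// Requires: using System.IO; using System.ServiceModel.Channels;
// Pool of reusable buffers: up to 10 chunks' worth of memory, each buffer at most chunkSize bytes.
BufferManager bm = BufferManager.CreateBufferManager(chunkSize * 10, (int)chunkSize);

for (int i = resumeChunk; i < chunks; i++)
{
    // TakeBuffer returns a buffer of at least the requested size.
    byte[] buffer = bm.TakeBuffer((int)chunkSize);
    try
    {
        // Cast to long so the offset cannot overflow an int on large files.
        fileStream.Position = (long)i * chunkSize;
        int actualLength = fileStream.Read(buffer, 0, (int)chunkSize);
        if (actualLength == 0) break;
        //Array.Resize(ref buffer, actualLength);
        using (MemoryStream stream = new MemoryStream(buffer))
        {
            // Only the first actualLength bytes of the buffer are valid.
            UploadFile(stream, actualLength);
        }
    }
    finally
    {
        // Always hand the buffer back to the pool, even if the upload throws.
        bm.ReturnBuffer(buffer);
    }
}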

This assumes that the implementation of UploadFile can be rewritten to take an int for the number of bytes to write.
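If changing UploadFile is not an option, one possible alternative (a sketch, not part of the code above) is to wrap only the filled part of the buffer, using the MemoryStream constructor that takes an offset and a count. The class name PooledChunk and the helper WrapFilled are made up for illustration, and a plain array stands in for a buffer taken from the pool.

```csharp
using System;
using System.IO;

public static class PooledChunk
{
    // Hypothetical helper: exposes only the first 'actualLength' bytes of a
    // (possibly larger) buffer as a read-only stream, without copying.
    public static MemoryStream WrapFilled(byte[] buffer, int actualLength)
    {
        return new MemoryStream(buffer, 0, actualLength, writable: false);
    }

    public static void Main()
    {
        const int chunkSize = 1024 * 1024;

        // Stand-in for a pooled buffer; in the real loop this would come
        // from bm.TakeBuffer and go back via bm.ReturnBuffer.
        byte[] buffer = new byte[chunkSize];

        // Pretend the last Read returned a short final chunk.
        int actualLength = 1234;

        using (MemoryStream stream = WrapFilled(buffer, actualLength))
        {
            // The consumer sees a stream of exactly actualLength bytes.
            Console.WriteLine(stream.Length);
        }
    }
}
```

With this approach the stream itself carries the valid length, so the original single-argument UploadFile(stream) could stay as it is.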

I hope this helps.

joe

See also RecyclableMemoryStream. From this article:

Microsoft.IO.RecyclableMemoryStream is a MemoryStream replacement that offers superior behavior for performance-critical systems. In particular it is optimized to do the following:

  • Eliminate Large Object Heap allocations by using pooled buffers
  • Incur far fewer gen 2 GCs, and spend far less time paused due to GC
  • Avoid memory leaks by having a bounded pool size
  • Avoid memory fragmentation
  • Provide excellent debuggability
  • Provide metrics for performance tracking

I'm not so sure about the first part of your question, but as for a better way: have you considered BITS? It allows background downloading of files over HTTP. You can give it an http:// or file:// URI. It can resume from the point at which it was interrupted, and it downloads in chunks of bytes using the HTTP Range header. It is used by Windows Update. You can subscribe to events that give information on progress and completion.
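To make the Range mechanism concrete, here is a small sketch of how a request for a single chunk can be built with HttpClient's types. This is not the BITS API, just an illustration of the header BITS relies on. The class name, the helper ForChunk, and the example.com URL are placeholders, and nothing is actually sent.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

public static class RangeRequestSketch
{
    // Builds a GET request for one chunk of a remote file.
    // chunkIndex and chunkSize are illustrative parameters.
    public static HttpRequestMessage ForChunk(string url, long chunkIndex, long chunkSize)
    {
        long from = chunkIndex * chunkSize;
        long to = from + chunkSize - 1; // Range bounds are inclusive

        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.Range = new RangeHeaderValue(from, to);
        return request;
    }

    public static void Main()
    {
        // Placeholder URL; the request is only constructed, never sent.
        using (var request = ForChunk("http://example.com/big.bin", 1, 1024 * 1024))
        {
            // Prints the header value the server would receive for the second 1 MB chunk.
            Console.WriteLine(request.Headers.Range);
        }
    }
}
```

Resuming then amounts to remembering the index of the last chunk that arrived completely and requesting the next one.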

I have come up with another solution for this; let me know what you think!

Since I don't want to keep large amounts of data in memory, I was looking for an elegant way to temporarily store byte arrays or a stream.

The idea is to create a temp file (you don't need specific rights to do this) and then use it like a memory stream. Making the class disposable lets it clean up the temp file after it has been used.

public class TempFileStream : Stream
{
    private readonly string _filename;
    private readonly FileStream _fileStream;

    public TempFileStream()
    {
        // GetTempFileName creates an empty, uniquely named file and returns its path.
        this._filename = Path.GetTempFileName();
        this._fileStream = File.Open(this._filename, FileMode.OpenOrCreate, FileAccess.ReadWrite);
    }

    public override bool CanRead
    {
        get
        {
            return this._fileStream.CanRead;
        }
    }

    // and so on with wrapping the stream to the underlying filestream

    // ... ...

    // finally override the Dispose method and remove the temp file
    protected override void Dispose(bool disposing)
    {
        base.Dispose(disposing);

        if (disposing)
        {
            // Dispose also closes the stream, so a separate Close call is not needed.
            this._fileStream.Dispose();

            try
            {
                File.Delete(this._filename);
            }
            catch (Exception)
            {
                // if something goes wrong while deleting the temp file we can ignore it.
            }
        }
    }
}
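As a side note, a similar effect can be had without a wrapper class by opening the temp file with FileOptions.DeleteOnClose, which asks for the file to be removed when the stream is closed. This is a different technique from the class above, shown only as a sketch. The class name TempFileDemo and the helper CreateSelfDeletingTempStream are made up for illustration.

```csharp
using System;
using System.IO;

public static class TempFileDemo
{
    // Hypothetical helper: opens a fresh temp file that is deleted
    // automatically when the returned stream is closed.
    public static FileStream CreateSelfDeletingTempStream(out string path)
    {
        path = Path.GetTempFileName();
        return new FileStream(path, FileMode.Open, FileAccess.ReadWrite,
                              FileShare.None, 4096, FileOptions.DeleteOnClose);
    }

    public static void Main()
    {
        string path;
        using (FileStream stream = CreateSelfDeletingTempStream(out path))
        {
            // Use it like a MemoryStream: write, rewind, read back.
            byte[] data = { 1, 2, 3 };
            stream.Write(data, 0, data.Length);
            stream.Position = 0;
            Console.WriteLine(stream.ReadByte());
        }

        // After the using block the file should no longer exist.
        Console.WriteLine(File.Exists(path));
    }
}
```

The trade-off is that the caller works with a plain FileStream rather than a dedicated type, so there is no place to hang extra behavior.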
