
C# MongoDb insert entire collection from a stream

I have a process that archives MongoDb collections by getting an IAsyncCursor and writing the raw bytes out to an Azure Blob stream. This seems to be quite efficient and it works. Here is the working code.

var cursor = await clientDb.GetCollection<RawBsonDocument>(collectionPath).Find(new BsonDocument()).ToCursorAsync();
while (cursor.MoveNext())
    foreach (var document in cursor.Current)
    {
        var bytes = new byte[document.Slice.Length];
        document.Slice.GetBytes(0, bytes, 0, document.Slice.Length);
        blobStream.Write(bytes, 0, bytes.Length);
    }

However, in order to move this data from the archive back into MongoDb, the only way I've figured out how to do it is to load the entire raw byte array into a memory stream and then call .InsertOneAsync() into MongoDb. This works fine for smaller collections, but for very large collections I'm getting MongoDb errors. It also obviously isn't very memory efficient. Is there any way to stream the raw byte data into MongoDb, or to use a cursor like I'm doing on the read?

var rawRef = clientDb.GetCollection<RawBsonDocument>(collectionPath);
using (var ms = new MemoryStream())
{
    await stream.CopyToAsync(ms);
    var bytes = ms.ToArray();
    var rawBson = new RawBsonDocument(bytes);
    await rawRef.InsertOneAsync(rawBson);
}

Here is the error I get if the collection is too large.

MongoDB.Driver.MongoConnectionException : An exception occurred while sending a message to the server.
---- System.IO.IOException : Unable to write data to the transport connection: An established connection was aborted by the software in your host machine..
-------- System.Net.Sockets.SocketException : An established connection was aborted by the software in your host machine.

Instead of copying the stream as a whole to a byte array and parsing it into a single RawBsonDocument, you can parse the documents one by one, e.g.:

while (stream.Position < stream.Length)
{
    // Deserialize one document at a time instead of buffering the whole stream
    var rawBson = BsonSerializer.Deserialize<RawBsonDocument>(stream);
    await rawRef.InsertOneAsync(rawBson);
}

The stream will be read one document at a time. The sample above inserts each document directly into the database. If you want to insert in batches, you can collect a reasonable number of documents in a list and use InsertManyAsync, as sketched below.
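For example, here is a minimal sketch of batched inserts, assuming the same stream and rawRef variables as in the snippets above and an arbitrary batch size of 1000 documents:

var batch = new List<RawBsonDocument>();
while (stream.Position < stream.Length)
{
    batch.Add(BsonSerializer.Deserialize<RawBsonDocument>(stream));
    if (batch.Count >= 1000)
    {
        // Send the current batch in a single round trip, then start a new one
        await rawRef.InsertManyAsync(batch);
        batch.Clear();
    }
}
if (batch.Count > 0)
    await rawRef.InsertManyAsync(batch); // flush any remaining documents

Keeping each batch bounded keeps memory usage predictable while still reducing the number of round trips compared to calling InsertOneAsync per document.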
