C# - OutOfMemoryException saving a List on a JSON file

I'm trying to save the streaming data of a pressure map. Basically, I have a pressure matrix defined as:

double[,] pressureMatrix = new double[e.Data.GetLength(0), e.Data.GetLength(1)];

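I receive a new pressureMatrix every 10 milliseconds, and I want to save all of the information in a JSON file so that I can reproduce it later.

What I do first is write what I call the header, containing all the settings used for the recording, like this: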

recordedData.softwareVersion = Assembly.GetExecutingAssembly().GetName().Version.Major.ToString() + "." + Assembly.GetExecutingAssembly().GetName().Version.Minor.ToString();
recordedData.calibrationConfiguration = calibrationConfiguration;
recordedData.representationConfiguration = representationSettings;
recordedData.pressureData = new List<PressureMap>();

var json = JsonConvert.SerializeObject(recordedData, Formatting.None);

File.WriteAllText(this.filePath, json);

Then, every time I get a new pressure map, I create a new thread to add the new PressureMatrix and rewrite the file:

var newPressureMatrix = new PressureMap(datos, DateTime.Now);
recordedData.pressureData.Add(newPressureMatrix);
var json = JsonConvert.SerializeObject(recordedData, Formatting.None);
File.WriteAllText(this.filePath, json);

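After about 20-30 minutes I get an OutOfMemory exception, because the system cannot hold the recordedData variable: the List<PressureMap> inside it has grown too large.

How can I handle this so that the data gets saved? I would like to save 24-48 hours of information.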

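Your basic problem is that you are holding all of your pressure map samples in memory, rather than writing each one individually and then allowing it to be garbage collected. What's worse, you are doing this in two different places: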

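  1. You serialize the entire list of samples to a JSON string json before writing the string to a file.

    Instead, as explained in Performance Tips: Optimize Memory Usage, in situations like this you should serialize and deserialize directly to and from the file. For instructions on how to do this, see the answers to "Can Json.NET serialize / deserialize to / from a stream?" and also "Serialize JSON to a file".

  2. recordedData.pressureData = new List<PressureMap>(); accumulates all of the pressure map samples, and then all of them are written again every time a new sample is created.

    A better solution would be to write each sample once and then forget it, but every sample has to be nested inside some container object in the JSON, which makes it non-obvious how to do that.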
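For item 1, a minimal sketch of serializing directly to and from a file with Json.NET might look like the following. The class name JsonFileHelper and its method names are hypothetical, chosen only for illustration; the Json.NET types used (JsonSerializer, JsonTextWriter, JsonTextReader) are the library's standard streaming types. Note that this only avoids building the intermediate JSON string; it does not by itself solve item 2.

```csharp
using System.IO;
using System.Text;
using Newtonsoft.Json;

// Hypothetical helper (not part of Json.NET) that streams JSON to and from a file
// without ever building the complete JSON string in memory.
public static class JsonFileHelper
{
    public static void SerializeToFile<T>(T value, string path, JsonSerializerSettings settings = null)
    {
        using (var stream = new FileStream(path, FileMode.Create))
        using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false)))
        using (var jsonWriter = new JsonTextWriter(textWriter))
        {
            // Tokens are written to the file as they are produced.
            JsonSerializer.CreateDefault(settings).Serialize(jsonWriter, value);
        }
    }

    public static T DeserializeFromFile<T>(string path, JsonSerializerSettings settings = null)
    {
        using (var stream = File.OpenRead(path))
        using (var textReader = new StreamReader(stream))
        using (var jsonReader = new JsonTextReader(textReader))
        {
            // Tokens are read from the file incrementally.
            return JsonSerializer.CreateDefault(settings).Deserialize<T>(jsonReader);
        }
    }
}
```

With such a helper, the two lines that call JsonConvert.SerializeObject and File.WriteAllText in the question could be replaced by a single call such as JsonFileHelper.SerializeToFile(recordedData, this.filePath).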

So, how can problem 2 be solved?

First, let's modify your data model as follows, splitting the header data out into a separate class:

public class PressureMap
{
    public double[,] PressureMatrix { get; set; }
}

public class CalibrationConfiguration 
{
    // Data model not included in question
}

public class RepresentationConfiguration 
{
    // Data model not included in question
}

public class RecordedDataHeader
{
    public string SoftwareVersion { get; set; }
    public CalibrationConfiguration CalibrationConfiguration { get; set; }
    public RepresentationConfiguration RepresentationConfiguration { get; set; }
}

public class RecordedData
{
    // Ensure the header is serialized first.
    [JsonProperty(Order = 1)]
    public RecordedDataHeader RecordedDataHeader { get; set; }
    // Ensure the pressure data is serialized last.
    [JsonProperty(Order = 2)]
    public IEnumerable<PressureMap> PressureData { get; set; }
}

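Option #1 is a version of the producer-consumer pattern. It involves two threads: one that generates PressureData samples, and one that serializes RecordedData. The first thread generates samples and adds them to a BlockingCollection<PressureMap> that is passed to the second thread. The second thread then serializes BlockingCollection<PressureMap>.GetConsumingEnumerable() as the value of RecordedData.PressureData.

The following code gives a skeleton of how to do this: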

var sampleCount = 400;    // Or whatever stopping criterion you prefer
var sampleInterval = 10;  // in ms

using (var pressureData = new BlockingCollection<PressureMap>())
{
    // Adapted from
    // https://docs.microsoft.com/en-us/dotnet/standard/collections/thread-safe/blockingcollection-overview
    // https://docs.microsoft.com/en-us/dotnet/api/system.collections.concurrent.blockingcollection-1?view=netframework-4.7.2

    // Spin up a Task to sample the pressure maps
    using (Task t1 = Task.Factory.StartNew(() =>
    {
        for (int i = 0; i < sampleCount; i++)
        {
            var data = GetPressureMap(i);
            Console.WriteLine("Generated sample {0}", i);
            pressureData.Add(data);
            System.Threading.Thread.Sleep(sampleInterval);
        }
        pressureData.CompleteAdding();
    }))
    {
        // Spin up a Task to consume the BlockingCollection
        using (Task t2 = Task.Factory.StartNew(() =>
        {
            var recordedDataHeader = new RecordedDataHeader
            {
                SoftwareVersion = softwareVersion,
                CalibrationConfiguration = calibrationConfiguration,
                RepresentationConfiguration = representationConfiguration,
            };

            var settings = new JsonSerializerSettings
            {
                ContractResolver = new CamelCasePropertyNamesContractResolver(),
            };

            using (var stream = new FileStream(this.filePath, FileMode.Create))
            using (var textWriter = new StreamWriter(stream))
            using (var jsonWriter = new JsonTextWriter(textWriter))
            {
                int j = 0;

                var query = pressureData
                    .GetConsumingEnumerable()
                    .Select(p => 
                            { 
                                // Flush the writer periodically in case the process terminates abnormally
                                jsonWriter.Flush();
                                Console.WriteLine("Serializing item {0}", j++);
                                return p;
                            });

                var recordedData = new RecordedData
                {
                    RecordedDataHeader = recordedDataHeader,
                    // Since PressureData is declared as IEnumerable<PressureMap>, evaluation will be lazy.
                    PressureData = query,
                };                          

                Console.WriteLine("Beginning serialization of {0} to {1}:", recordedData, this.filePath);
                JsonSerializer.CreateDefault(settings).Serialize(jsonWriter, recordedData);
                Console.WriteLine("Finished serialization of {0} to {1}.", recordedData, this.filePath);
            }
        }))
        {
            Task.WaitAll(t1, t2);
        }
    }
}

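Notes:

  • This solution takes advantage of the fact that, when serializing an IEnumerable<T>, Json.NET will not materialize the enumerable as a list. Instead it takes full advantage of lazy evaluation and simply enumerates through it, writing and then forgetting each item it encounters.

  • The first thread samples PressureData and adds the samples to the blocking collection.

  • The second thread wraps the blocking collection in an IEnumerable<PressureMap> and then serializes that as RecordedData.PressureData.

    During serialization, the serializer enumerates through the IEnumerable<PressureMap>, streaming each item to the JSON file and then proceeding to the next, effectively blocking until one becomes available.

  • You will need to do some experimentation to make sure that the serialization thread can "keep up" with the sampling thread, possibly by setting a BoundedCapacity during construction. If it cannot, you may need to adopt a different strategy.

  • PressureMap GetPressureMap(int count) should be some method of yours (not shown in the question) that returns the current pressure map sample.

  • With this technique the JSON file remains open for the duration of the sampling session. If sampling terminates abnormally, the file may be truncated. I try to mitigate this by flushing the writer periodically.

  • Although serializing the data will no longer require an unbounded amount of memory, deserializing a RecordedData later will deserialize PressureData into a concrete List<PressureMap>. This may cause memory problems during downstream processing.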
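The note about BoundedCapacity can be illustrated with a small, self-contained sketch. It uses plain integers instead of PressureMap samples, and the class and method names (BoundedQueueDemo, Run) are hypothetical. Constructing the BlockingCollection<T> with a capacity makes Add block whenever the queue is full, so a slow consumer applies back-pressure to the producer rather than letting the queue grow without limit.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo of a bounded producer-consumer queue.
public static class BoundedQueueDemo
{
    // Pushes itemCount integers through a queue that holds at most `capacity` items
    // and returns them in the order the consumer received them.
    public static List<int> Run(int itemCount, int capacity)
    {
        var consumed = new List<int>();
        using (var queue = new BlockingCollection<int>(boundedCapacity: capacity))
        {
            var producer = Task.Run(() =>
            {
                try
                {
                    for (int i = 0; i < itemCount; i++)
                        queue.Add(i); // Blocks while the queue already holds `capacity` items.
                }
                finally
                {
                    // Always signal completion so the consumer loop below can end.
                    queue.CompleteAdding();
                }
            });

            // Blocks until an item is available; ends once adding is complete and the queue is empty.
            foreach (var item in queue.GetConsumingEnumerable())
                consumed.Add(item);

            producer.Wait();
        }
        return consumed;
    }
}
```

Whether blocking the sampling thread is acceptable depends on the hardware. If samples must never be delayed, TryAdd with a timeout returns false instead of waiting, which lets the producer drop or count missed samples; that is a design decision this sketch does not make for you.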

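Demo fiddle #1 here.

Option #2 would be to switch from a JSON file to a newline-delimited JSON file. Such a file consists of a sequence of JSON objects separated by newline characters. In your case, the first object would contain the RecordedDataHeader information, and the subsequent objects would be of type PressureMap: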

var sampleCount = 100; // Or whatever
var sampleInterval = 10;

var recordedDataHeader = new RecordedDataHeader
{
    SoftwareVersion = softwareVersion,
    CalibrationConfiguration = calibrationConfiguration,
    RepresentationConfiguration = representationConfiguration,
};

var settings = new JsonSerializerSettings
{
    ContractResolver = new CamelCasePropertyNamesContractResolver(),
};

// Write the header
Console.WriteLine("Beginning serialization of sample data to {0}.", this.filePath);

using (var stream = new FileStream(this.filePath, FileMode.Create))
{
    JsonExtensions.ToNewlineDelimitedJson(stream, new[] { recordedDataHeader });
}

// Write each sample incrementally

for (int i = 0; i < sampleCount; i++)
{
    Thread.Sleep(sampleInterval);
    Console.WriteLine("Performing sample {0} of {1}", i, sampleCount);
    var map = GetPressureMap(i);

    using (var stream = new FileStream(this.filePath, FileMode.Append))
    {
        JsonExtensions.ToNewlineDelimitedJson(stream, new[] { map });
    }
}

Console.WriteLine("Finished serialization of sample data to {0}.", this.filePath);

Using the extension methods:

public static partial class JsonExtensions
{
    // Adapted from the answer to
    // https://stackoverflow.com/questions/44787652/serialize-as-ndjson-using-json-net
    // by dbc https://stackoverflow.com/users/3744182/dbc
    public static void ToNewlineDelimitedJson<T>(Stream stream, IEnumerable<T> items)
    {
        // Let caller dispose the underlying stream 
        using (var textWriter = new StreamWriter(stream, new UTF8Encoding(false, true), 1024, true))
        {
            ToNewlineDelimitedJson(textWriter, items);
        }
    }

    public static void ToNewlineDelimitedJson<T>(TextWriter textWriter, IEnumerable<T> items)
    {
        var serializer = JsonSerializer.CreateDefault();

        foreach (var item in items)
        {
            // Formatting.None is the default; I set it here for clarity.
            using (var writer = new JsonTextWriter(textWriter) { Formatting = Formatting.None, CloseOutput = false })
            {
                serializer.Serialize(writer, item);
            }
            // http://specs.okfnlabs.org/ndjson/
            // Each JSON text MUST conform to the [RFC7159] standard and MUST be written to the stream followed by the newline character \n (0x0A). 
            // The newline character MAY be preceded by a carriage return \r (0x0D). The JSON texts MUST NOT contain newlines or carriage returns.
            textWriter.Write("\n");
        }
    }

    // Adapted from the answer to 
    // https://stackoverflow.com/questions/29729063/line-delimited-json-serializing-and-de-serializing
    // by Yuval Itzchakov https://stackoverflow.com/users/1870803/yuval-itzchakov
    public static IEnumerable<TBase> FromNewlineDelimitedJson<TBase, THeader, TRow>(TextReader reader)
        where THeader : TBase
        where TRow : TBase
    {
        bool first = true;

        using (var jsonReader = new JsonTextReader(reader) { CloseInput = false, SupportMultipleContent = true })
        {
            var serializer = JsonSerializer.CreateDefault();

            while (jsonReader.Read())
            {
                if (jsonReader.TokenType == JsonToken.Comment)
                    continue;
                if (first)
                {
                    yield return serializer.Deserialize<THeader>(jsonReader);
                    first = false;
                }
                else
                {
                    yield return serializer.Deserialize<TRow>(jsonReader);
                }
            }
        }
    }
}

Later, you can process the newline-delimited JSON file as follows:

using (var stream = File.OpenRead(filePath))
using (var textReader = new StreamReader(stream))
{
    foreach (var obj in JsonExtensions.FromNewlineDelimitedJson<object, RecordedDataHeader, PressureMap>(textReader))
    {
        if (obj is RecordedDataHeader)
        {
            var header = (RecordedDataHeader)obj;
            // Process the header
            Console.WriteLine(JsonConvert.SerializeObject(header));
        }
        else
        {
            var row = (PressureMap)obj;
            // Process the row.
            Console.WriteLine(JsonConvert.SerializeObject(row));
        }
    }
}

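Notes:

  • This approach looks simpler because the samples are added incrementally to the end of the file, rather than being inserted inside some overall JSON container.

  • With this approach, both serialization and downstream processing can be done with bounded memory use.

  • The sample file does not remain open for the duration of the sampling session, so it is less likely to be truncated.

  • Downstream applications may not have built-in tools for processing newline-delimited JSON.

  • This strategy may integrate more simply with your current threading code.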
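As a rough illustration of the bounded-memory point, here is a simpler line-at-a-time variant. It is not the extension-method approach shown above, but an alternative sketch that relies on each JSON text occupying exactly one line. The names NdjsonLines, AppendLine and ReadRows are hypothetical. File.ReadLines enumerates the file lazily, so only one row needs to be in memory at a time.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Newtonsoft.Json;

// Hypothetical line-oriented helper for newline-delimited JSON files.
public static class NdjsonLines
{
    // Appends one object as a single line of JSON. Formatting.None emits no line breaks,
    // and newlines inside string values are escaped by the serializer.
    public static void AppendLine<T>(string path, T item)
    {
        var line = JsonConvert.SerializeObject(item, Formatting.None);
        File.AppendAllText(path, line + "\n");
    }

    // Lazily reads rows of type T, skipping the first `skip` lines (for example, the header).
    public static IEnumerable<T> ReadRows<T>(string path, int skip = 0)
    {
        return File.ReadLines(path)
            .Skip(skip)
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => JsonConvert.DeserializeObject<T>(line));
    }
}
```

For the file layout described above, something like NdjsonLines.ReadRows<PressureMap>(filePath, skip: 1) would enumerate the samples one by one; the header would be read separately from the first line. Opening and closing the file for every appended sample is simple but has a cost at a 10 ms sampling interval, so it would need measuring on the real system.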

Demo fiddle #2 here.
