How to efficiently write large 2D arrays to a CSV file

EDIT: Using the solution provided, I was able to get it to work (not the most efficient approach, but it works) with the following:

    public static void WriteArray(object[,] dataArray, string filePath, bool deleteFileIfExists = true)
    {
        if (deleteFileIfExists && File.Exists(filePath))
        {
            File.Delete(filePath);
        }

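        // Use int for the column count and index: a byte cast silently truncates above 255 columns.
        int columnCount = dataArray.GetLength(1);
        int rowCount = dataArray.GetLength(0);

        for(int row = ReadExcel.ExcelIndex; row <= rowCount; row++)
        {
            List<string> stringList = new List<string>();
            for(int column = ReadExcel.ExcelIndex; column <= columnCount; column++)
            {
                // Empty cells can be null; write them as empty fields instead of throwing.
                stringList.Add(dataArray[row, column]?.ToString() ?? string.Empty);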
            }
            string rowAsString = StringifyRow(stringList.ToArray());
            WriteLine(filePath, rowAsString);
        }
    }

    public static string StringifyRow(string[] row)
    {
        return string.Join(",", row);
    }

    public static void WriteLine(string filePath, string rowAsString)
    {
        using (StreamWriter writer = new StreamWriter(filePath, append: true))
        {
            writer.WriteLine(rowAsString);
            writer.Flush();
        }
    }
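The version above reopens the file for every row, which adds overhead on each of the 70K+ rows. As a rough sketch (the `ArrayCsvWriter` class and its method names are hypothetical, not from the original post), the same loop can keep a single writer open and write each cell straight to it, so no per-row list or string is built. It reads the bounds from the array itself instead of relying on `ReadExcel.ExcelIndex`, on the assumption that the array may be either 0-based or 1-based:

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper for illustration only.
public static class ArrayCsvWriter
{
    // Streams the array to any TextWriter, one cell at a time.
    public static void Write(object[,] data, TextWriter writer, char delimiter = ',')
    {
        // Ask the array for its bounds so 0-based and 1-based arrays both work.
        int firstRow = data.GetLowerBound(0), lastRow = data.GetUpperBound(0);
        int firstCol = data.GetLowerBound(1), lastCol = data.GetUpperBound(1);

        for (int row = firstRow; row <= lastRow; row++)
        {
            for (int col = firstCol; col <= lastCol; col++)
            {
                // Delimiter goes between fields, not after the last one.
                if (col > firstCol)
                {
                    writer.Write(delimiter);
                }
                // Convert.ToString returns an empty string for null cells.
                writer.Write(Convert.ToString(data[row, col], CultureInfo.InvariantCulture));
            }
            writer.WriteLine();
        }
    }

    // Opens the file once; new StreamWriter(path) overwrites an existing file.
    public static void WriteFile(object[,] data, string filePath)
    {
        using (var writer = new StreamWriter(filePath))
        {
            Write(data, writer);
        }
    }
}
```

Calling `ArrayCsvWriter.WriteFile(dataArray, filePath)` would replace the `WriteArray`/`WriteLine` pair. This sketch does no quoting, so it assumes the values contain no commas, quotes, or line breaks.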

I am working on a data consolidation application. This app will read in data as 2D arrays, manipulate some of the columns in memory, consolidate arrays, then write to a CSV file.

I am working on a test method for writing to CSV. Currently my array is just over 70K rows and 29 columns. In production, there will always be at least 4 times as many rows. For the test, I did not consolidate any arrays. This is just from one data source.

In order to write to CSV, I have tried numerous methods:

  1. Pass the array (object[,]) to a helper method that loops through each row, creates a string from the values, and appends the string to a list, then pass the list to the csvwriter to write the list of strings to a file. I get OutOfMemory errors when creating the list of strings.
  2. The same as above but with StringBuilder and no List, just one giant string built with StringBuilder (found on .netperls). Same error: an OutOfMemory exception.
  3. No helper method: just pass the array to the csvwriter method, loop through each row, build a string (I've tried this with StringBuilder as well) and write each line one by one to the file. I get an OutOfMemory exception.

Some of my code (various methods have been commented out) is below:

using System.IO;
using System.Text;
using System.Collections.Generic;
namespace ExcelIO
{
    public static class CsvWriter
    {
        public static void WriteStringListCsv(List<string> data, string filePath, bool deleteIfExists = true)
        {
            if (deleteIfExists && File.Exists(filePath))
            {
                File.Delete(filePath);
            }

            //foreach(string record in data)
            //{
            //    File.WriteAllText(filePath, record);
            //}

            using (StreamWriter outfile = new StreamWriter(filePath))
            {
                foreach (string record in data)
                {
                    //trying to write data to csv
                    outfile.WriteLine(record);
                }
            }
        }

        public static void WriteDataArrayToCsv(List<string> data, string filePath, bool deleteIfExists = true)
        {
            if (deleteIfExists && File.Exists(filePath))
            {
                File.Delete(filePath);
            }

            //foreach(string record in data)
            //{
            //    File.WriteAllText(filePath, record);
            //}

            //using (StreamWriter outfile = new StreamWriter(filePath))
            //{
            //    File.WriteAllText(filePath, )
            //}

            using (StreamWriter outfile = new StreamWriter(filePath))
            {
                foreach(string record in data)
                {
                    outfile.WriteLine(record);
                }
            }
        }


        public static List<string> ConvertArrayToStringList(object[,] dataArray, char delimiter=',', bool includeHeaders = true)
        {
            List<string> dataList = new List<string>();
            byte colCount = (byte)dataArray.GetLength(1);
            int rowCount = dataArray.GetLength(0);

            int startingIndex = includeHeaders ? ReadExcel.ExcelIndex : ReadExcel.ExcelIndex + 1;

            //StringBuilder dataAsString = new StringBuilder();
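            // Note: the increment below is rowCount++, not rowIndex++, so rowIndex never advances
            // and the loop keeps adding the same row to dataList until memory is exhausted.
            for (int rowIndex = startingIndex; rowIndex <= rowCount; rowCount++)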
            {
                StringBuilder rowAsString = new StringBuilder();
                //string rowAsString = "";
                for (byte colIndex = ReadExcel.ExcelIndex; colIndex <= colCount; colIndex++)
                {

                    //rowAsString += dataArray[rowIndex, colIndex];
                    //rowAsString += (colIndex == colCount) ? "" : delimiter.ToString();
                    // Wrap in Quotes
                    rowAsString.Append($"\"{dataArray[rowIndex, colIndex]}\"");
                    // Separate fields with the delimiter, but do not append one after the last column.
                    if (colIndex != colCount)
                    {
                        rowAsString.Append(delimiter.ToString());
                    }
                }
                // Move to nextLine
                //dataAsString.AppendLine();
                //outfile.WriteLine(rowAsString);
                dataList.Add(rowAsString.ToString());
            }
            //return dataAsString.ToString();
            return dataList;
        }
    }
}

I've tried everything I've found by searching online, but everything gives me an OutOfMemory exception when piecing together the rows (even when building just one row at a time and writing that). Is there a better way to efficiently, and hopefully quickly, write a large 2D array to a CSV file?

Any and all tips are greatly appreciated.

With code like the following I've never had any memory issues, even with huge data sets:

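    using System.Collections.Generic;
    using System.IO;

    class Program
    {
        static void Main(string[] args)
        {
            // Populate this list with the records to export.
            List<Record> data = new List<Record>();

            // The using block flushes and closes the writer, even if an exception is thrown.
            using (StreamWriter writer = new StreamWriter("Filename"))
            {
                foreach (Record record in data)
                {
                    // Build and write one line at a time so only a single row is held in memory.
                    string line = string.Join(",", record.field1, record.field2, record.field3, record.field4, record.field5);
                    writer.WriteLine(line);
                }
            }
        }
    }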
    public class Record
    {
        public string field1 { get; set; }
        public string field2 { get; set; }
        public string field3 { get; set; }
        public string field4 { get; set; }
        public string field5 { get; set; }

    }
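One caveat with `string.Join(",", ...)`: a row comes out malformed if any field itself contains a comma, a double quote, or a line break. A minimal sketch of a field escaper, following the common CSV convention of quoting such fields and doubling embedded quotes, might look like this (the `CsvField` class is a hypothetical name, and a comma delimiter is assumed):

```csharp
using System;

// Hypothetical helper for illustration only.
public static class CsvField
{
    // Characters that force a field to be quoted (assumes a comma delimiter).
    private static readonly char[] Special = { ',', '"', '\r', '\n' };

    public static string Escape(string value)
    {
        // Null or empty values become empty fields.
        if (string.IsNullOrEmpty(value))
        {
            return string.Empty;
        }
        // Plain values are written unchanged.
        if (value.IndexOfAny(Special) < 0)
        {
            return value;
        }
        // Wrap in quotes and double any quotes inside the value.
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }
}
```

Each field would be passed through `CsvField.Escape` before joining, for example `string.Join(",", CsvField.Escape(record.field1), CsvField.Escape(record.field2))`.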
