Reading an Excel file and writing to a text file is very slow using C#

I have the following code in a C# WPF application. I am reading an Excel file, removing hidden characters while trying to retain the cell formatting, and then writing the data to a pipe-delimited text file. The code looks straightforward but is very slow. Any ideas on why, and how I can improve the process?

    private void ReadWriteExcelData(string strFileName)
    {
        Excel.Application xlApp;
        Excel.Workbook xlWorkBook;
        Excel.Worksheet xlWorkSheet;
        Excel.Range range, colrange, rowrange;

        xlApp = new Excel.Application();
        xlWorkBook = xlApp.Workbooks.Open(strFileName, 0, true, 5, "", "", true,
            Excel.XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);

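        Excel.Sheets excelSheets = xlWorkBook.Worksheets;
        // Assumption: the first sheet is the one to read; the original snippet never assigns xlWorkSheet.
        xlWorkSheet = (Excel.Worksheet)excelSheets[1];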
        if (blnLetExcelDecide)
            {
                range = xlWorkSheet.UsedRange;
            }
            else
            {
                Excel.Range c1 = xlWorkSheet.Cells[lngExcelStartRow, strExcelStartCol];
                Excel.Range c2 = xlWorkSheet.Cells[lngExcelEndRow, strExcelEndCol];
                range = (Excel.Range)xlWorkSheet.get_Range(c1, c2);
            }

            colrange = range.Columns;
            lngNumCols = colrange.Count;
            rowrange = range.Rows;
            lngNumRows = rowrange.Count;

            object[,] values = (object[,])range.Value;
            string[] Fields = new string[lngNumCols];
            int NumRow = 1;

            while (NumRow <= values.GetLength(0))
            {
                strDataRow = "";

                for (lngColCnt = 1; lngColCnt <= lngNumCols; lngColCnt++)
                {
                    strCellData = range[NumRow, lngColCnt].Text;

                    // Check for null before trimming so TrimStart cannot throw.
                    if (strCellData == null)
                    {
                        strCellData = string.Empty;
                    }
                    else
                    {
                        strCellData = strCellData.TrimStart(' ');
                        strCellData = strCellData.Replace("\r\n", " ").Replace("\n", " ").Replace("\r", " ");
                    }

                    if (lngColCnt == lngNumCols)
                    {
                        strDataRow += strCellData;
                    }
                    else
                    {
                        strDataRow += strCellData + "|";
                    }
                }

                WriteDataRow(strDataRow, strFullOutputFileName);

                if (NumRow % intModNumber == 0)
                {
                    dblProgressPct = ((double)NumRow / (double)lngNumRows);
                    dblProgress = Math.Round((dblProgressPct * 100), 0);
                    prgIndicator.Width = dblProgress * 4;
                    lblPrctPrgrs.Content = dblProgress + "%";

                    grdProgressIndicator.InvalidateVisual();
                    System.Windows.Forms.Application.DoEvents();
                }

                NumRow++;
            }
       }  

Here is the WriteDataRow routine:

    public void WriteDataRow(string strDataRow, string strFullFileName)
    {
        using (StreamWriter file = new StreamWriter(@strFullFileName, true, Encoding.GetEncoding("iso-8859-1")))
        {
            file.WriteLine(strDataRow);
        }
    }

Here's one approach that uses some VBA to read all of the cells' Text values.

First, create an xlsm file containing this function in a regular module:

Public Function GetText(strWB As String, strSheet As String, _
                        strAddress As String) As Variant()
    Dim rng As Range, arr() As Variant, r As Long, c As Long
    Set rng = Workbooks(strWB).Worksheets(strSheet).Range(strAddress)
    rng.Columns.AutoFit 'avoid getting "######" !
    ReDim arr(0 To rng.Rows.Count - 1, 0 To rng.Columns.Count - 1)
    For r = 1 To rng.Rows.Count
    For c = 1 To rng.Columns.Count
        arr(r - 1, c - 1) = rng.Cells(r, c).Text
    Next c
    Next r
    GetText = arr
End Function

After opening your data file, open the file with the macro:

Excel.Workbook xlCodeWb = xlApp.Workbooks.Open(@"D:\Folder\Stuff\TheMacro.xlsm"); 

Then call the macro:

object[,] values = xlApp.Run("'" + xlCodeWb.Name + "'!GetText", 
                   xlWorkBook.Name, xlWorkSheet.Name, range.Address); 

values is now a 2D array of all of the Text values from the sheet, without the overhead of picking out each one in a separate call across the process boundary. You can iterate over the array and write the "cleaned-up" values to your file.
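Iterating over the array could look something like the sketch below. The names PipeRows, CleanCell and ToPipeLines are made up for illustration and are not part of the original code. The clean-up mirrors what the question does (trim leading spaces, replace line breaks with spaces), and the loop reads the array bounds rather than assuming them, since arrays coming back from Excel may be 0-based or 1-based.

```csharp
using System;
using System.Collections.Generic;

public static class PipeRows
{
    // Hypothetical helper: null becomes empty, leading spaces are trimmed,
    // and embedded line breaks are replaced with single spaces.
    public static string CleanCell(object cell)
    {
        if (cell == null) return string.Empty;
        string text = Convert.ToString(cell) ?? string.Empty;
        return text.TrimStart(' ')
                   .Replace("\r\n", " ")
                   .Replace("\n", " ")
                   .Replace("\r", " ");
    }

    // Yields one pipe-delimited line per row of the 2D array.
    // Bounds are queried so both 0-based and 1-based arrays work.
    public static IEnumerable<string> ToPipeLines(object[,] values)
    {
        int rowLo = values.GetLowerBound(0), rowHi = values.GetUpperBound(0);
        int colLo = values.GetLowerBound(1), colHi = values.GetUpperBound(1);
        var fields = new string[colHi - colLo + 1];

        for (int r = rowLo; r <= rowHi; r++)
        {
            for (int c = colLo; c <= colHi; c++)
            {
                fields[c - colLo] = CleanCell(values[r, c]);
            }
            yield return string.Join("|", fields);
        }
    }
}
```

Using string.Join also removes the need for the "is this the last column" check when adding the delimiter.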

BTW you should probably consider opening and writing to the output file in your main method: open it once, write the lines, and close it only when you're done. No need to re-open it for every line.
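A minimal sketch of the open-once idea, assuming the rows have already been built as strings. LineFile and WriteAllRows are hypothetical names, not from the original post. The encoding matches the one WriteDataRow uses, and the append flag is left to the caller because the original routine appends.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineFile
{
    // Opens the output file once, writes every row, and closes it when done.
    // Returns the number of rows written.
    public static int WriteAllRows(string path, IEnumerable<string> rows, bool append)
    {
        int written = 0;
        using (var writer = new StreamWriter(path, append, Encoding.GetEncoding("iso-8859-1")))
        {
            foreach (string row in rows)
            {
                writer.WriteLine(row);
                written++;
            }
        }
        return written;
    }
}
```

With this shape, the per-row call to WriteDataRow would be replaced by a single call after (or while) the rows are produced, so the file is opened and closed once rather than once per line.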

I added a check for the format type "#,##0.00_) Red". Only when a cell has this format do I call Convert.ToString(range[NumRow, lngColCnt].Value2).
