简体   繁体   中英

How to read an Excel spreadsheet in c# quickly

I am using Microsoft.Office.Interop.Excel to read a spreadsheet that is open in memory.

gXlWs = (Microsoft.Office.Interop.Excel.Worksheet)gXlApp.ActiveWorkbook.ActiveSheet;
int NumCols = 7;
string[] Fields = new string[NumCols];
string input = null;
int NumRow = 2;
while (Convert.ToString(((Microsoft.Office.Interop.Excel.Range)gXlWs.Cells[NumRow, 1]).Value2) != null)
{
    for (int c = 1; c <= NumCols; c++)
    {
        Fields[c-1] = Convert.ToString(((Microsoft.Office.Interop.Excel.Range)gXlWs.Cells[NumRow, c]).Value2);
    }
    NumRow++;

    //Do my other processing
}

I have 180,000 rows and this turns out be very slow. I am not sure the "Convert" is efficient. Is there anyway I could do this faster?

Moon

Hi I found a very much faster way.

It is better to read the entire data in one go using "get_range". This loads the data into memory and I can loop through that like a normal array.

Microsoft.Office.Interop.Excel.Range range = gXlWs.get_Range("A1", "F188000");
object[,] values = (object[,])range.Value2;
int NumRow=1;
while (NumRow < values.GetLength(0))
{
    for (int c = 1; c <= NumCols; c++)
    {
        Fields[c - 1] = Convert.ToString(values[NumRow, c]);
    }
    NumRow++;
}

There are several options - all involve some additional library:

  • OpenXML 2.0 (free library from MS) can be used to read/modify the content of an .xlsx so you can do with it what you want

  • some (commercial) 3rd-party libraries come with grid controls allowing you to do much more with excel files in your application (be it Winforms/WPF/ASP.NET...) like SpreadsheetGear , Aspose.Cells etc.

I am not sure the "Convert" is efficient. Is there anyway I could do this faster?

What makes you believe this? I promise you that Convert.ToString() is the most effective method in the code you posted. Your problem is that your looping through 180,000 records in an excel document...

You could split the work up since you know the number of row this is trival to do.

Why are you coverting Value2 to a string exactly?

Use the OleDB Method. That is the fastest as follows;

string con =
  @"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\temp\test.xls;" + 
  @"Extended Properties='Excel 8.0;HDR=Yes;'";    
using(OleDbConnection connection = new OleDbConnection(con))
{
    connection.Open();
    OleDbCommand command = new OleDbCommand("select * from [Sheet1$]", connection); 
    using(OleDbDataReader dr = command.ExecuteReader())
    {
         while(dr.Read())
         {
             var row1Col0 = dr[0];
             Console.WriteLine(row1Col0);
         }
    }
}

I guess it's not the Convert the source of "slowing"...

Actually, retrieving cell values is very slow.

I think this conversion is not necessary:

(Microsoft.Office.Interop.Excel.Range)gXlWs

It should work without that.

And you can ask directly:

gXlWs.Cells[NumRow, 1].Value != null

Try to move the entire range or, at least, the entire row to an object Matrix and work with it instead of the range itself.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM