
Optimise performance of an ODBC data reader

For a project, I need to read data through the Acomba ODBC driver. Acomba is an old accounting package. Behind the scenes, it stores its data as hex in flat files. The vendor provides an ODBC driver that works decently, but it is very slow.

One particular client has 17,000 products. Nothing too big. But fetching those 17,000 products through the ODBC driver takes over 2 min 30 s. Some clients have close to 1 million products, so performance becomes a huge issue.

Basically, this code lives in a Web API. It gets the data, builds a CSV from it and returns the CSV file in the HTTP response. The CSV generation takes more or less 1 second, so that is not the issue.

I tried a few things:

        using (var db = new OdbcConnection($"DSN={_settings.DsnName}"))
        {
           await db.OpenAsync();
           // Dispose the command and reader as well as the connection.
           using (var comm = new OdbcCommand(sql, db))
           using (var dr = comm.ExecuteReader())
           {
              while (dr.Read())
              {
                 var col1 = dr.GetValue(0).ToString();
                 var col2 = dr.GetValue(1).ToString();
                 var col3 = dr.GetValue(2).ToString();
              }
           }
        }

Same thing, but with

dr.GetValues(destinationArray)

Same thing, but with an OdbcDataAdapter

           var dt = new DataTable();
           using (var cmd = new OdbcCommand(sql, db))
           using (var adapter = new OdbcDataAdapter(cmd))
           {
              adapter.Fill(dt);
           }

All of these end up taking between 2 min and 2 min 20 s.
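For reference, the `GetValues` variant mentioned above can be sketched as follows. This is only an illustration of the pattern (one reusable buffer per reader, one call per row) feeding the CSV step; the class name `ReaderCsv` and its methods are hypothetical, and as the timings show, this pattern alone did not change the run time with this driver.

```csharp
using System;
using System.Data;
using System.Globalization;
using System.IO;

// Hypothetical helper, not part of the original code.
public static class ReaderCsv
{
    // Writes every row of the reader as one CSV line and returns the row count.
    public static int WriteCsv(IDataReader reader, TextWriter output)
    {
        var buffer = new object[reader.FieldCount]; // allocated once, reused for each row
        int rows = 0;
        while (reader.Read())
        {
            int n = reader.GetValues(buffer); // copies the whole row in one call
            for (int i = 0; i < n; i++)
            {
                if (i > 0) output.Write(',');
                // DBNull and null both become an empty field.
                string text = Convert.ToString(buffer[i], CultureInfo.InvariantCulture) ?? string.Empty;
                output.Write(Escape(text));
            }
            output.Write('\n');
            rows++;
        }
        return rows;
    }

    // Quotes a field only when it contains a separator, quote or line break.
    private static string Escape(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\n', '\r' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }
}
```

With the reader from the first attempt, the call would look like `ReaderCsv.WriteCsv(dr, writer)`, where `writer` is whatever `TextWriter` backs the response.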

The issue with this particular table is that there are 150 columns. Processing all 150 columns is what takes all the time.

I searched for a way to optimize this, but at the end of the day, all the code I found is basically the same as what I wrote.

Here is a twist!

If I open the ODBC connection in Excel or Access, both of them can build a table and display it in under 5 seconds. I scrolled all the way down, so the data is actually loaded; it is not only showing the first 50 rows.

Does anybody know how to get the same kind of performance in C#?

Thanks for your time!

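I found a way to make the ODBC connection go MUCH faster. Basically, this old ODBC driver is a COM component. It turns out that COM objects run much faster when called from a single-threaded apartment (STA) thread than from a multi-threaded apartment (MTA) thread, presumably because calls from an MTA thread into an apartment-threaded component have to be marshaled across apartments. So I used the solution posted in this article: http://ryanhaugh.com/archive/2014/05/24/supporting-sta-threads-in-web-api/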

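using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Schedulers; // StaTaskScheduler (ParallelExtensionsExtras)

public static class TaskFactoryExtensions
{
    // A single STA thread shared by every call.
    private static readonly TaskScheduler _staScheduler = new StaTaskScheduler(numberOfThreads: 1);

    public static Task<TResult> StartNewSta<TResult>(this TaskFactory factory, Func<TResult> action)
    {
        return factory.StartNew(action, CancellationToken.None, TaskCreationOptions.None, _staScheduler);
    }
}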

And in my API call, I do this:

Task<DataTable> responseTask = Task.Factory.StartNewSta(() => GetData(sqlQuery));

The actual GetData method, which is identical to the old one:

  private DataTable GetData(string value)
  {
     var dt = new DataTable();

     // The command is now disposed along with the connection and the adapter.
     using (var db = new OdbcConnection($"DSN={_settings.DsnName}"))
     using (var comm = new OdbcCommand(value, db))
     using (var da = new OdbcDataAdapter(comm))
     {
        db.Open();
        da.Fill(dt);
     }

     return dt;
  }

I use the StartNewSta extension method for all the calls that fetch data. With the exact same code, it went from 2 min 30 s to 9 seconds.

So the key to fast COM objects is STA threads!
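Note that `StaTaskScheduler` is not part of the base class library; it comes from the ParallelExtensionsExtras samples. If pulling that in is not an option, the same idea can be sketched with a plain `Thread` and a `TaskCompletionSource`. This is a minimal sketch under a few assumptions: `StaRunner` is a hypothetical name, it starts a new STA thread per call instead of reusing one as `StaTaskScheduler(numberOfThreads: 1)` does, and it only works on Windows.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not the code from the linked article.
public static class StaRunner
{
    // Runs func on a new dedicated STA thread and exposes the result as a Task.
    // Windows only: setting the STA apartment state is not supported elsewhere.
    public static Task<T> RunAsync<T>(Func<T> func)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        var thread = new Thread(() =>
        {
            try { tcs.SetResult(func()); }
            catch (Exception ex) { tcs.SetException(ex); } // surface failures to the awaiter
        });
        thread.SetApartmentState(ApartmentState.STA); // must be set before Start()
        thread.IsBackground = true;                   // do not keep the process alive
        thread.Start();
        return tcs.Task;
    }
}
```

In the API call this would be used as `DataTable dt = await StaRunner.RunAsync(() => GetData(sqlQuery));`. Creating a thread per request is simpler but costs more than a shared STA thread, so the scheduler approach is probably the better fit under load.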
