
LINQ System.OutOfMemoryException

I have a long-running process in C# that hits a SQL table anywhere from 10 to 200 times. When the process exceeds about 50 hits and each query returns more than about 100,000 rows from the same table, it throws a System.OutOfMemoryException at this line, specifically at the bottom where it converts the IQueryable to a List:

var cht = from p in _db.TickerData
          where p.Time >= Convert.ToDateTime(start) &&
          p.Time <= Convert.ToDateTime(end)
          orderby p.Time
          select p;

_prices = cht.ToList();    < this is where the System.OutOfMemoryException occurs >

What can I do to prevent this error?

The data you are trying to retrieve is too large for your list. The exception occurs on ToList() because that is exactly where the query executes. What do you want to achieve with such a large list? Possible solutions are:

1) Restrict your search with more criteria. Don't load the whole data set; load one part of it, and then another part if you really need it (see the sketch after this list).

2) Use another data structure than a list if you want to load the whole data set into memory; take a look at ConcurrentDictionary.
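
For option 1, a minimal sketch of what tighter criteria might look like (symbol, Symbol, and Price are illustrative names, not from the original question); the select new projection also pulls back only the columns you actually need instead of whole entities:

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);
var cht = from p in _db.TickerData
          where p.Time >= startTime && p.Time <= endTime
                && p.Symbol == symbol              // hypothetical extra filter
          orderby p.Time
          select new { p.Time, p.Price };          // project only the columns you need
var prices = cht.ToList();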

First off:

specifically at the bottom where it converts the IQueryable to a List

Yes, that's where you'd expect the Out Of Memory condition to occur.

The assignment of cht above doesn't actually hit the database; all it does is declare the shape of the query. This is called deferred execution and LINQ uses it all over the place. It means "we don't actually process anything until your code needs it."

Calling ToList(), though, essentially says "the code needs it, all of it, right now." So that's where it sends the query to the database, pulls back all the results at once, uses LINQ magic to turn them into CLR objects, and stuffs them all in a List<T> for you.
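
To make that split concrete, here's a minimal sketch, assuming the same _db context as the question:

var query = _db.TickerData.Where(p => p.Time >= startTime);  // no SQL has been sent yet
query = query.OrderBy(p => p.Time);                          // still nothing has been sent
var list = query.ToList();                                   // the query runs here; every row is materialized at once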

Having said that, this is just a hunch, but it's possible that your LINQ provider doesn't know what Convert.ToDateTime is. If it doesn't know how to handle that, it won't put it into the WHERE clause in the query it executes, and instead it will load the entire table and filter it client-side, which might be why you crash when the table gets too big, rather than when the result set gets too big.

To verify this, use a profiler for your database to intercept the query, and see if the WHERE clause looks like you'd expect. If it's not translating right, try this instead:

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);
var cht = from p in _db.TickerData
          where p.Time >= startTime && p.Time <= endTime
          orderby p.Time
          select p;
_prices = cht.ToList();

If that doesn't help, you're probably just pulling back too much data, and you'll have to work on that the same ways you'd work with processing too much data in any other context.
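
One such way, if you only need to scan the rows once rather than keep them all around, is to skip the list entirely and enumerate the query directly; enumeration streams results from the underlying data reader one row at a time instead of buffering them in a List<T>. A sketch (ProcessPrice is a hypothetical per-row handler):

foreach (var p in cht)    // rows are streamed as you iterate; no list is built
{
    ProcessPrice(p);      // hypothetical: do your per-row work here
}

Bear in mind the database connection stays open for the duration of the loop, so keep the per-row work short.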

Your problem is that the query returns a very large set of data which needs to be stored in your process's memory. Too much data => OutOfMemoryException. That's normal. What's not normal is trying to do such a thing. Instead, you can limit the result set with some extra filtering, or break the large result set into smaller ones, maybe like so:

DateTime startDateTime = Convert.ToDateTime(start);
DateTime endDateTime = Convert.ToDateTime(end);
int fetched = 0;
int totalFetched = 0;

do
{
    // fetch a batch of 1000 records
    var _prices = _db.TickerData.Where(p => p.Time >= startDateTime && p.Time <= endDateTime)
                                .OrderBy(p => p.Time)
                                .Skip(totalFetched)
                                .Take(1000)
                                .ToList();

    // do here whatever you want with your batch of 1000 records
    fetched = _prices.Count;
    totalFetched += fetched;
}
while (fetched > 0);

This way you can process any amount of data, in batches.

EDIT: fixed some issues, as reported by @Code.me in the comments section.

EDIT: I suggest you set up an index at the database level on the Time column, if you haven't already, to speed up these queries.
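
If you're on Entity Framework Core, for instance, the index can also be declared from code with the fluent API's HasIndex; this is just a sketch of that idea, and you can equally create the index directly in the database:

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    // assumes an EF Core DbContext exposing the TickerData entity
    modelBuilder.Entity<TickerData>().HasIndex(p => p.Time);
}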

Because of deferred execution, the query will execute when you call ToList() on it. Since loading all the data at once will consume too much memory, it's good practice to process it in batches.

The code below will let you fetch 1000 records at a time (or whatever number works best for you) so you can process them.

var startTime = Convert.ToDateTime(start);
var endTime = Convert.ToDateTime(end);

IEnumerable<T> prices = new List<T>();  // whatever T is

var currentFetched = 0;
var totalFetched = 0;

do
{
    var cht = _db.TickerData.Where(p => p.Time >= startTime && p.Time <= endTime)
                            .OrderBy(p => p.Time)
                            .Skip(totalFetched)
                            .Take(1000)
                            .ToList();

    currentFetched = cht.Count;
    totalFetched += currentFetched;

    // prices = prices.Concat(cht).ToList();  
    // ^ This might throw an exception later when the list becomes too big 
    // So you can probably process currently fetched data
}
while (currentFetched > 0);

What I did with mine was just let it return an IQueryable object instead of a list. You can still use it much like a list, but execution stays deferred, so it performs a lot better.
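
A minimal sketch of that approach (the method name and signature are illustrative, not from the original post); because it returns IQueryable<T>, nothing executes until a caller enumerates it, and callers can compose extra filters or paging first:

IQueryable<TickerData> GetPrices(DateTime startTime, DateTime endTime)
{
    return _db.TickerData
              .Where(p => p.Time >= startTime && p.Time <= endTime)
              .OrderBy(p => p.Time);   // still just a query definition; nothing runs yet
}

// a caller can then page through it without ever materializing the full range:
var firstPage = GetPrices(startTime, endTime).Take(1000).ToList();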
