I've got a data table with two columns, a number and a date, for example:
125 | 2013/10/20
100 | 2013/10/21
150 | 2013/10/24
225 | 2013/10/24
250 | 2013/10/28
310 | 2013/10/30
Now I want to find all sets of records, ordered by date, whose numbers add up to 500. I can easily see that the first, third and fourth records (125 + 150 + 225 = 500) provide a match, but to program this I can only think of going through the data table a zillion times until I find the right match.
Does anybody have a smarter idea?
In the worst case, you do have to go through all 2^n subsets of your dataset, but if all of your items are non-negative you can start by filtering on item.Number <= 500.
Here is a possible Subsets method (actually an answer to "How to get all subsets of an array?", but don't tell them):
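// Extension methods must live in a static class; needs System.Linq and System.Collections.Generic.
public static IEnumerable<IEnumerable<T>> Subsets<T>(this IEnumerable<T> source)
{
    // Any() rather than a null check on FirstOrDefault(), so value types (and a genuine default item) work too.
    if (!source.Any()) return new[] { Enumerable.Empty<T>() };
    var first = source.First();
    // Materialise once, because the subsets of the tail are enumerated twice below.
    var others = source.Skip(1).Subsets().ToList();
    // Every subset of the tail, followed by the same subsets with the first item added.
    return others.Concat(others.Select(s => s.Concat(new[] { first })));
}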
Once you have your Subsets method, you can filter the result as follows, though performance is still of the order of a gazillion (or 2^n if you want to be picky).
var sets = items.Where(i => i.Number <= 500)
.Subsets().Where(s => s.Sum(i => i.Number) == 500);
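To make the cost concrete, here is a minimal self-contained sketch of that brute-force filter run on the six numbers from the question. The class name BruteForce is just a placeholder, and the dates are left out because they play no part in the sum; the Subsets variant here carries an explicit type parameter and tests for emptiness with a count instead of a null check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BruteForce
{
    // Returns every subset of the source, 2^n of them for n items.
    public static IEnumerable<IEnumerable<T>> Subsets<T>(this IEnumerable<T> source)
    {
        var items = source.ToList();
        // The only subset of an empty sequence is the empty set.
        if (items.Count == 0) return new[] { Enumerable.Empty<T>() };
        var others = items.Skip(1).Subsets().ToList();
        // Subsets without the first item, then the same subsets with it added.
        return others.Concat(others.Select(s => s.Concat(new[] { items[0] })));
    }

    public static void Main()
    {
        // The Number column from the question.
        var numbers = new[] { 125, 100, 150, 225, 250, 310 };

        var all = numbers.Where(n => n <= 500).Subsets().ToList();
        var matches = all.Where(s => s.Sum() == 500).ToList();

        Console.WriteLine($"{all.Count} subsets examined, {matches.Count} match");
        foreach (var match in matches)
            Console.WriteLine(string.Join(" + ", match));
    }
}
```

With six items this examines 64 subsets to find the two that add up to 500 (125 + 150 + 225 and 100 + 150 + 250). Every extra row doubles the number of subsets, which is why this only works for small tables.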
However, if you do have the constraint on Number that it is non-negative, you can combine the Subsets operation with a search for a target sum. That would mean you would define
// Item stands for your record type (assumed to expose an int Number); declare this inside a static class.
public static IEnumerable<IEnumerable<Item>> SubsetsAddingUpTo(this IEnumerable<Item> source, int target)
{
    // This stopping condition ensures that you will not have to walk the rest of the tree once you have exceeded your target.
    // It assumes that the Number values are all non-negative.
    if (target < 0) return Enumerable.Empty<IEnumerable<Item>>();
    // Out of items: the empty set is a match only if the target has been hit exactly.
    if (!source.Any())
        return target == 0 ? new[] { Enumerable.Empty<Item>() } : Enumerable.Empty<IEnumerable<Item>>();
    var first = source.First();
    // Items larger than the remaining target can never be part of a match.
    var tail = source.Skip(1).Where(i => i.Number <= target).ToList();
    var othersIncludingFirst = tail.SubsetsAddingUpTo(target - first.Number);
    var othersExcludingFirst = tail.SubsetsAddingUpTo(target);
    return othersExcludingFirst.Concat(othersIncludingFirst.Select(s => s.Concat(new[] { first })));
}
Because the check for <= target happens inside the method, you don't have to do any pre-filtering. However, you can perform a sort before you do the search, to give you your sets in a hierarchical date order. The call will be
var sets = items.OrderByDescending(i => i.Date).SubsetsAddingUpTo(500);
This should actually give you decent performance. The worst case (every item has a number of 0 or 1) won't be very good (order 2^n), but if most of the values of Number are of a similar order of magnitude to your target sum, as is the case in your example, then the stopping condition will come to your rescue and save you a large number of unnecessary operations.
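For reference, here is a rough end-to-end sketch of the pruned search on the table from the question. The Row record and the SubsetSearch class are names I made up for illustration (records need C# 9 or later), and the empty-input case returns the empty set when the target is exactly 0, which is what lets a completed combination be reported.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for one row of the data table.
public record Row(int Number, DateTime Date);

public static class SubsetSearch
{
    // Assumes every Number is non-negative, so overshooting the target is final.
    public static IEnumerable<IEnumerable<Row>> SubsetsAddingUpTo(this IEnumerable<Row> source, int target)
    {
        // Overshot the target: nothing below this point can match.
        if (target < 0) return Enumerable.Empty<IEnumerable<Row>>();

        var items = source.ToList();
        // Nothing left to choose: a match only if the target was hit exactly.
        if (items.Count == 0)
            return target == 0
                ? new[] { Enumerable.Empty<Row>() }
                : Enumerable.Empty<IEnumerable<Row>>();

        var first = items[0];
        // Drop rows that are already too large for the remaining target.
        var tail = items.Skip(1).Where(r => r.Number <= target).ToList();
        var including = tail.SubsetsAddingUpTo(target - first.Number);
        var excluding = tail.SubsetsAddingUpTo(target);
        return excluding.Concat(including.Select(s => s.Concat(new[] { first })));
    }

    // The sample rows from the question.
    public static List<Row> SampleRows() => new List<Row>
    {
        new Row(125, new DateTime(2013, 10, 20)),
        new Row(100, new DateTime(2013, 10, 21)),
        new Row(150, new DateTime(2013, 10, 24)),
        new Row(225, new DateTime(2013, 10, 24)),
        new Row(250, new DateTime(2013, 10, 28)),
        new Row(310, new DateTime(2013, 10, 30)),
    };

    public static void Main()
    {
        var sets = SampleRows().OrderByDescending(r => r.Date).SubsetsAddingUpTo(500);
        foreach (var set in sets)
            Console.WriteLine(string.Join(" + ", set.Select(r => $"{r.Number} ({r.Date:yyyy/MM/dd})")));
    }
}
```

On this data the search reports two combinations: 125, 150 and 225, and 100, 150 and 250. Each item is appended after the subsets found in its tail, so sorting descending beforehand lists the rows of each combination from the earliest date to the latest.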