Find a Date range from a list of Date ranges where they overlap

I'm having a bit of trouble processing a list of objects that have simple From and To properties, both of which are DateTimes. I want the result to be a list of the same type of object, showing the ranges where there are overlaps. To be honest, I think I've gone a bit code/logic blind now!

For example (please note, dates are in dd/MM/yyyy format):

TS1: 01/01/2020 to 10/01/2020 
TS2: 08/01/2020 to 20/01/2020 

So in this case I would expect to get 2 objects, both containing the same data:

TSA: 08/01/2020 to 10/01/2020
TSB: 08/01/2020 to 10/01/2020

A more complex example:

TS1: 01/01/2020 to 10/01/2020 
TS2: 08/01/2020 to 20/01/2020 
TS3: 18/01/2020 to 22/01/2020 

So in this case I would expect to get 4 objects, two sets of two containing the same data:

TSA: 08/01/2020 to 10/01/2020
TSB: 08/01/2020 to 10/01/2020
TSC: 18/01/2020 to 20/01/2020
TSD: 18/01/2020 to 20/01/2020

One more example:

TS1: 01/01/2020 to 01/10/2020 
TS2: 01/02/2020 to 01/09/2020 
TS3: 01/03/2020 to 01/04/2020 

So in this case I would expect to get 3 objects, all containing the same data:

TSA: 01/03/2020 to 01/04/2020
TSB: 01/03/2020 to 01/04/2020
TSC: 01/03/2020 to 01/04/2020

I've tried researching an algorithm online, but I either had no luck finding exactly what I want or only found SQL-based answers.

Any suggestions would be very welcome.

Edit: Just to explain what this is going to be used for, which might make things a bit clearer for some of the commenters below. Each of these date ranges denotes a room which is in use. This system is meant to report back the date ranges when there are no rooms available at all. As I already know the number of rooms, I can determine from these results whether there is any availability and return the date ranges with no availability.

I've also edited the expected results after trying some of the answers below.
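For the room-availability use described in the edit, one possible approach is a sweep over start and end events rather than a list of pairwise overlaps. The sketch below is only an illustration under assumptions: the names RoomAvailability, FullyBooked, bookings and roomCount are made up here, ranges are treated as half-open (a booking that ends on a date frees the room for one starting on that same date), and it reports the periods in which at least roomCount ranges are active at once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RoomAvailability
{
    // Hypothetical helper: returns the periods during which at least
    // `roomCount` of the given ranges are active at the same time.
    // Assumption: ranges are half-open, [From, To).
    public static List<(DateTime From, DateTime To)> FullyBooked(
        IEnumerable<(DateTime From, DateTime To)> bookings, int roomCount)
    {
        if (roomCount < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(roomCount));
        }

        // Turn every range into two events: +1 at its start, -1 at its end.
        // Sorting by time and then by delta handles an end (-1) before a
        // start (+1) that falls on the same instant.
        var events = bookings
            .SelectMany(b => new[] { (Time: b.From, Delta: 1), (Time: b.To, Delta: -1) })
            .OrderBy(e => e.Time)
            .ThenBy(e => e.Delta)
            .ToList();

        var result = new List<(DateTime From, DateTime To)>();
        var active = 0;              // number of ranges active right now
        DateTime? fullSince = null;  // start of the current fully booked period, if any

        foreach (var e in events)
        {
            active += e.Delta;
            if (fullSince == null && active >= roomCount)
            {
                // Occupancy has just reached the room count.
                fullSince = e.Time;
            }
            else if (fullSince != null && active < roomCount)
            {
                // Occupancy has just dropped below the room count.
                result.Add((fullSince.Value, e.Time));
                fullSince = null;
            }
        }

        return result;
    }
}
```

With the second example above (TS1 to TS3) and roomCount set to 2, this yields 08/01/2020 to 10/01/2020 and 18/01/2020 to 20/01/2020, each reported once.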

The following algorithm calculates the result in O(n log(n)) in the common case, although it is still O(n^2) in the worst case.

First, a record class.

public class DateRange
{
    public DateRange(DateTime from, DateTime to)
    {
        From = from;
        To = to;
    }

    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

My algorithm is as follows. I added some comments to it, so I hope it is comprehensible. In principle, it exploits the fact that most ranges (hopefully) do not overlap with more than a few other ranges: it processes the input in sorted order and drops older input entries from consideration once the current input has moved past their end time.

public static IEnumerable<DateRange> FindOverlaps(IList<DateRange> dateRanges)
{
    if (dateRanges.Count < 2)
    {
        return Enumerable.Empty<DateRange>();
    }

    // Sort the input ranges by start time, in ascending order, to process them in that order.
    var orderedRanges = dateRanges.OrderBy(x => x.From).ToList();
    // Keep a list of previously processed values.
    var previousRanges = new List<DateRange>
    {
        orderedRanges.First(),
    };

    var result = new List<DateRange>();
    foreach (var value in orderedRanges.Skip(1))
    {
        var toDelete = new List<DateRange>();
        // Go through all ranges that start before the current one, and pick those among
        // them that end after the current one starts as result values, and also, delete all
        // those that end before the current one starts from the list -- given that the input
        // is sorted, they will never overlap with future input values.
        foreach (var dateRange in previousRanges)
        {
            if (value.From >= dateRange.To)
            {
                toDelete.Add(dateRange);
            }
            else
            {
                result.Add(new DateRange(value.From, value.To < dateRange.To ? value.To : dateRange.To));
            }
        }
        foreach (var candidate in toDelete)
        {
            previousRanges.Remove(candidate);
        }
        previousRanges.Add(value);
    }

    return result;
}

Note that it is possible for all n values in the input to overlap one another. In this case there are n*(n-1)/2 overlapping pairs, so the algorithm will necessarily run in O(n^2). However, in the well-behaved case where each date range overlaps only a small number of other date ranges, the complexity will be roughly O(n log(n)), with the expensive operation being the .OrderBy() call on the input.
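To make the worst-case count concrete, here is a small brute-force sketch that counts overlapping pairs. It is not part of the algorithm above: OverlapCounting and CountOverlappingPairs are hypothetical names, and ranges are represented as plain tuples to keep the example short. Each unordered pair is counted once, and ranges that merely touch at an end point are not counted.

```csharp
using System;
using System.Collections.Generic;

public static class OverlapCounting
{
    // Hypothetical helper: counts the unordered pairs of ranges that overlap.
    // Two ranges overlap when each one starts before the other one ends.
    public static int CountOverlappingPairs(
        IReadOnlyList<(DateTime From, DateTime To)> ranges)
    {
        var count = 0;
        for (var i = 0; i < ranges.Count; i++)
        {
            // Start at i + 1 so every pair is visited exactly once.
            for (var j = i + 1; j < ranges.Count; j++)
            {
                if (ranges[i].From < ranges[j].To && ranges[j].From < ranges[i].To)
                {
                    count++;
                }
            }
        }

        return count;
    }
}
```

For n ranges that all overlap one another this returns n*(n-1)/2, for example 10 for five identical ranges, which is why any approach that lists every overlapping pair is quadratic on such input.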

One more consideration. Consider you have a list of input values like so:

var example = new[]
{
    new DateRange(new DateTime(2000, 1, 1), new DateTime(2010, 1, 10)),
    new DateRange(new DateTime(2000, 2, 1), new DateTime(2000, 10, 10)),
    new DateRange(new DateTime(2000, 3, 11), new DateTime(2000, 9, 12)),
    new DateRange(new DateTime(2000, 4, 11), new DateTime(2000, 8, 12)),
};

In this case, not only do all the values overlap, they are also contained within one another. My algorithm as posted above will report such regions multiple times (for example, it will return the range from 2000-04-11 to 2000-08-12 three times, because it overlaps three other date ranges). In case you don't want overlapping regions to be reported multiple times like that, you can feed the output of the above function (converted with .ToList(), since the function below takes an IList) to the following function to filter them down:

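public static IEnumerable<DateRange> MergeRanges(IList<DateRange> dateRanges)
{
    var r = new List<DateRange>();
    // Nothing to merge; this also avoids indexing into an empty list.
    if (dateRanges.Count == 0)
    {
        return r;
    }

    // Expects the input to be sorted by From, as the output of FindOverlaps is.
    // Work on copies so that the caller's objects are not modified.
    var currentOverlap = new DateRange(dateRanges[0].From, dateRanges[0].To);
    foreach (var dateRange in dateRanges.Skip(1))
    {
        if (dateRange.From > currentOverlap.To)
        {
            // Gap found: the current merged range is complete.
            r.Add(currentOverlap);
            currentOverlap = new DateRange(dateRange.From, dateRange.To);
        }
        else
        {
            // Still overlapping: extend the merged range if this one ends later.
            currentOverlap.To = currentOverlap.To > dateRange.To ? currentOverlap.To : dateRange.To;
        }
    }
    r.Add(currentOverlap);
    return r;
}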

This does not affect the overall algorithmic complexity, as it is a single linear pass over its input.
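If merging is more than you want and you only need each distinct overlap reported once, a lighter alternative is to drop exact duplicates. This is a different technique from MergeRanges, shown only as a sketch: OverlapDedup and DistinctRanges are hypothetical names, and the DateRange class is repeated here just so the snippet stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DateRange
{
    public DateRange(DateTime from, DateTime to)
    {
        From = from;
        To = to;
    }

    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

public static class OverlapDedup
{
    // Hypothetical helper: keeps the first occurrence of every distinct
    // (From, To) pair. DateRange is a class without value equality, so the
    // ranges are grouped by a tuple key instead of calling Distinct() directly.
    public static List<DateRange> DistinctRanges(IEnumerable<DateRange> overlaps)
    {
        return overlaps
            .GroupBy(r => (r.From, r.To))
            .Select(g => g.First())
            .ToList();
    }
}
```

On the nested example above, FindOverlaps returns six ranges. DistinctRanges would keep three of them (one per distinct From/To pair), whereas MergeRanges collapses them into the single range 2000-02-01 to 2000-10-10.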

Let's assume you defined a type to store the date ranges like this:

public class DataObject
{
    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

Then you can compare the items in your list to each other to determine whether they overlap and, if so, return the overlapping period of time. (This is just to point you in the right direction; I did not thoroughly test this algorithm.)

public DataObject[] GetOverlaps(DataObject[] objects)
{
    var result = new List<DataObject>();

    if (objects.Length > 1)
    {
        for (var i = 0; i < objects.Length - 1; i++)
        {
            var pivot = objects[i];

            for (var j = i + 1; j < objects.Length; j++)
            {
                var other = objects[j];

                // Do both ranges overlap?
                if (pivot.From > other.To || pivot.To < other.From)
                {
                    // No
                    continue;
                }

                result.Add(new DataObject
                {
                    From = pivot.From >= other.From ? pivot.From : other.From,
                    To = pivot.To <= other.To ? pivot.To : other.To,
                });
            }
        }
    }

    return result.ToArray();
}

Some of the requirements around nested ranges and other corner cases are not exactly clear, but here's another algorithm that seems to do what you're after. The usual caveats of limited testing apply, of course: it may not work at all for corner cases I didn't test.

The algorithm relies on sorting the data first. If you can't do that then this won't work. The algorithm itself is as follows:

    static IEnumerable<DateRange> ReduceToOverlapping(IEnumerable<DateRange> source)
    {
        if (!source.Any())
            yield break;

        Stack<DateRange> stack = new Stack<DateRange>();

        foreach (var r in source)
        {
            while (stack.Count > 0 && r.Start > stack.Peek().End)
                stack.Pop();

            foreach (var left in stack)
            {
                if (left.GetOverlap(r) is DateRange overlap)
                    yield return overlap;
            }

            stack.Push(r);
        }
    }

DateRange is a simple class that holds the dates you presented. It looks like this:

class DateRange 
{
    public DateRange(DateRange other)
    { this.Start = other.Start; this.End = other.End; }

    public DateRange(DateTime start, DateTime end)
    { this.Start = start; this.End = end; }

    public DateRange(string start, string end)
    {
        const string format = "dd/MM/yyyy";

        this.Start = DateTime.ParseExact(start, format, CultureInfo.InvariantCulture);
        this.End = DateTime.ParseExact(end, format, CultureInfo.InvariantCulture);
    }

    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public DateRange GetOverlap(DateRange next)
    {
        if (this.Start <= next.Start && this.End >= next.Start)
        {
            return new DateRange(next.Start, this.End < next.End ? this.End : next.End);
        }

        return null;
    }
}

As I mentioned, the input has to be sorted first. An example of sorting the data and calling the method on some test data is here:

    static void Main(string[] _)
    {
        foreach (var (inputdata, expected) in TestData)
        {
            var sorted = inputdata.OrderBy(x => x.Start).ThenBy(x => x.End);
            var reduced = ReduceToOverlapping(sorted).ToArray();

            if (!Enumerable.SequenceEqual(reduced, expected, new CompareDateRange()))
                throw new ArgumentException("failed to produce correct result");
        }

        Console.WriteLine("all results correct");
    }

To test, you'll need an equality comparer and the test data, which are here:

class CompareDateRange : IEqualityComparer<DateRange>
{
    public bool Equals([AllowNull] DateRange x, [AllowNull] DateRange y)
    {
        if (null == x && null == y)
            return true;

        if (null == x || null == y)
            return false;

        return x.Start == y.Start && x.End == y.End;
    }

    public int GetHashCode([DisallowNull] DateRange obj)
    {
        return obj.Start.GetHashCode() ^ obj.End.GetHashCode();
    }
}

    public static (DateRange[], DateRange[])[] TestData = new (DateRange[], DateRange[])[]
    {
        (new DateRange[]
            {
                new DateRange("01/01/2020", "18/01/2020"),
                new DateRange("08/01/2020", "17/01/2020"),
                new DateRange("09/01/2020", "15/01/2020"),
                new DateRange("14/01/2020", "20/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "17/01/2020"),
                new DateRange("09/01/2020", "15/01/2020"),
                new DateRange("09/01/2020", "15/01/2020"),
                new DateRange("14/01/2020", "15/01/2020"),
                new DateRange("14/01/2020", "17/01/2020"),
                new DateRange("14/01/2020", "18/01/2020"),
            }),
        (new DateRange[]
            {
                new DateRange("01/01/2020", "10/01/2020"),
                new DateRange("08/01/2020", "20/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "10/01/2020"),
            }),
        (new DateRange[]
            {
                new DateRange("01/01/2020", "10/01/2020"),
                new DateRange("08/01/2020", "20/01/2020"),
                new DateRange("18/01/2020", "22/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "10/01/2020"),
                new DateRange("18/01/2020", "20/01/2020"),
            }),
        (new DateRange[]
            {
                new DateRange("01/01/2020", "18/01/2020"),
                new DateRange("08/01/2020", "10/01/2020"),
                new DateRange("18/01/2020", "22/01/2020"),
            },
         new DateRange[]
            {
                new DateRange("08/01/2020", "10/01/2020"),
                new DateRange("18/01/2020", "18/01/2020"),
            }),
    };
