
Pagination of multiple sorted lists

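I have an unknown number of sorted lists that I need to paginate. For example, with a page size of 6, the pages for these 3 lists should look like this.

  • List 1: 01, 02, 03, 04, 05, 06, 07, 08, 09, 10
  • List 2: 11, 12, 13, 14, 15
  • List 3: 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28

Resulting pages:

  • Page 1: 01, 11, 16, 02, 12, 17
  • Page 2: 03, 13, 18, 04, 14, 19
  • Page 3: 05, 15, 20, 06, 21, 07
  • Page 4: 22, 08, 23, 09, 24, 10
  • Page 5: 25, 26, 27, 28

Given a page number, what is the most efficient way to work out which items to take from each list (starting index and number of items)?

Keep in mind that each list can hold a few hundred thousand items, so iterating over all of them would be inefficient.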

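I think this can be done neatly in two steps:

  1. Flatten your lists into a single list (ordered the way you describe).
  2. Take the items for the desired page from that list.

To accomplish step 1, I would do something like what is suggested here: Merging multiple lists

So (assuming your page items are ints, as in your example), here is a method that finds exactly the items you want: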

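    // Requires: using System.Collections.Generic; using System.Linq;
    static IEnumerable<int> GetPageItems(IEnumerable<List<int>> itemLists, int pageSize, int page)
    {
        // Pair every item with its index inside its own list, then group by that index:
        // group 0 holds the first item of each list, group 1 the second item, and so on.
        // Note: GroupBy reads every item of every list before it yields the first group.
        var mergedOrderedItems = itemLists.SelectMany(x => x.Select((s, index) => new { s, index }))
                                          .GroupBy(x => x.index)
                                          .SelectMany(x => x.Select(y => y.s));

        // assuming that the first page is page 1, not page 0:
        var startingIndex = pageSize * (page - 1);

        var pageItems = mergedOrderedItems.Skip(startingIndex)
                                          .Take(pageSize);
        return pageItems;
    }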

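Note: you don't have to worry about passing in a page number beyond the last page that exists for the given items. Because of the way LINQ's Skip and Take behave, the method simply returns an empty IEnumerable in that case. Likewise, if Take(pageSize) produces fewer than pageSize items, it just returns the items it did find.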
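The Skip/Take behaviour described in the note can be checked in isolation. This is only a minimal sketch written as a C# top-level program; the sample data and variable names are made up for illustration and are not part of the answer's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: 4 items, so a page size of 3 gives only 2 pages.
var items = new List<int> { 1, 2, 3, 4 };
const int pageSize = 3;

// Page 2 is a partial page: Take returns the single remaining item.
var lastPage = items.Skip(pageSize * (2 - 1)).Take(pageSize).ToList();

// Page 5 does not exist: Skip runs past the end and the result is simply empty.
var beyondEnd = items.Skip(pageSize * (5 - 1)).Take(pageSize).ToList();

Console.WriteLine(string.Join(", ", lastPage));   // 4
Console.WriteLine(beyondEnd.Count);               // 0
```

Neither call throws, so the caller does not need to validate the page number against the total item count first.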

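I can't say whether this is the most efficient approach, but here is an algorithm with O(M * Log2(M)) time complexity, where M is the number of lists. It works as follows. The item counts of the input lists are sorted in ascending order and then walked one by one, skipping the range that each count covers, until the effective start index falls inside the current range. This works because at every step we know the current count is the minimum, so all of the remaining lists have items throughout that range. Once that is done, we emit the page items from the remaining lists.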
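The first phase of that description, locating where the page starts, can be sketched on its own. The helper name `FindPageStart` and its tuple result are assumptions made for this illustration; the sketch looks only at the list lengths and is written as a C# top-level program.

```csharp
using System;
using System.Linq;

// Hypothetical helper: given the list lengths and the zero-based global index of the
// first item on the page, return the row (item index inside each list), the number of
// lists that still have an item in that row, and how many of those lists to skip.
static (int Row, int LiveLists, int Skip) FindPageStart(int[] counts, int start)
{
    var sorted = counts.OrderBy(c => c).ToArray();
    int live = sorted.Length;   // lists that still have items at the current row
    int row = 0;                // rows below this index are already accounted for
    foreach (int count in sorted)
    {
        if (row < count)
        {
            // Rows [row, count) form a band in which every row holds `live` items.
            int band = live * (count - row);
            if (start < band) break;   // the page starts inside this band
            start -= band;
            row = count;
        }
        live--;                        // this list is exhausted from here on
    }
    if (live == 0) return (row, 0, 0); // start lies past the last item
    return (row + start / live, live, start % live);
}

// Lengths from the question (10, 5, 13). Page 4 with page size 6 starts at global index 18.
var pos = FindPageStart(new[] { 10, 5, 13 }, 18);
Console.WriteLine(pos);   // (6, 2, 1)
```

For the question's data this says page 4 begins in row 6, where two lists are still alive, after skipping one of them. That is item 22 of list 3, matching the expected pages.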

Here is the function:

    // Note: pageIndex is zero-based here.
    static IEnumerable<T> GetPageItems<T>(List<List<T>> itemLists, int pageSize, int pageIndex)
    {
        if (pageSize <= 0 || pageIndex < 0) yield break;   // nothing sensible to return

        int start = pageIndex * pageSize;   // global index of the first item on the page
        var counts = new int[itemLists.Count];
        for (int i = 0; i < counts.Length; i++)
            counts[i] = itemLists[i].Count;
        Array.Sort(counts);

        // Phase 1: walk the sorted counts and skip whole ranges of complete rows
        // until the start index falls inside the current range.
        int listCount = counts.Length;   // lists that still have items at itemIndex
        int itemIndex = 0;
        for (int i = 0; i < counts.Length; i++)
        {
            int itemCount = counts[i];
            if (itemIndex < itemCount)
            {
                int rangeLength = listCount * (itemCount - itemIndex);
                if (start < rangeLength) break;
                start -= rangeLength;
                itemIndex = itemCount;
            }
            listCount--;
        }

        // Phase 2: emit the page items round-robin from the lists that are still alive.
        if (listCount > 0)
        {
            var listQueue = new List<T>[listCount];
            listCount = 0;
            foreach (var list in itemLists)
                if (itemIndex < list.Count) listQueue[listCount++] = list;
            itemIndex += start / listCount;
            int listIndex = 0;
            int skipCount = start % listCount;   // items to skip in the first row
            int nextCount = 0;                   // lists that survive into the next row
            int yieldCount = 0;
            while (true)
            {
                var list = listQueue[listIndex];
                if (skipCount > 0)
                    skipCount--;
                else
                {
                    yield return list[itemIndex];
                    if (++yieldCount >= pageSize) break;
                }
                // Keep this list for the next row only if it has another item.
                if (itemIndex + 1 < list.Count)
                {
                    if (nextCount != listIndex)
                        listQueue[nextCount] = list;
                    nextCount++;
                }
                if (++listIndex < listCount) continue;
                if (nextCount == 0) break;
                itemIndex++;
                listIndex = 0;
                listCount = nextCount;
                nextCount = 0;
            }
        }
    }

And a test:

    static void Main(string[] args)
    {
        var data = new List<List<int>>
        {
            new List<int> { 01, 02, 03, 04, 05, 06, 07, 08, 09, 10 },
            new List<int> { 11, 12, 13, 14, 15 },
            new List<int> { 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 },
        };
        int totalCount = data.Sum(list => list.Count);
        int pageSize = 6;
        int pageCount = 1 + (totalCount - 1) / pageSize;
        for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
            Console.WriteLine("Page #{0}: {1}", pageIndex + 1, string.Join(", ", GetPageItems(data, pageSize, pageIndex)));
        Console.ReadLine();
    }

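I'm going to submit another implementation, based on Bear.S's feedback on my first answer. This one is fairly low-level and performs very well. It has two main parts:

  1. Figure out which item should appear first on the page (specifically, the index of the list that contains it and the index of the item within that list).

  2. Take items from all of the lists in the correct order, as needed (until we have all the items we need or we run out of items).

This implementation does not iterate over the individual lists during step 1. It does use the List.Count property, but that is an O(1) operation.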
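The question asks specifically for a starting index and an item count per list, and that can also be derived from the list lengths alone. The sketch below is not the approach this answer takes; it is an alternative illustration that binary-searches the number of complete rows in front of a position. The helper names `TakenPerList` and `GetPageRanges` are hypothetical, and the page index is assumed to be zero-based.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: how many items of each list fall within the first `n` items
// of the round-robin order. Only the list lengths are inspected.
static int[] TakenPerList(IReadOnlyList<int> counts, long n)
{
    // Number of items contained in the first `rows` complete rows.
    long Prefix(long rows) => counts.Sum(c => Math.Min((long)c, rows));

    // Binary search for the largest number of complete rows that fits into n items.
    long lo = 0, hi = counts.Count == 0 ? 0 : counts.Max();
    while (lo < hi)
    {
        long mid = lo + (hi - lo + 1) / 2;
        if (Prefix(mid) <= n) lo = mid; else hi = mid - 1;
    }

    long rest = n - Prefix(lo);   // items taken from the next, partial row
    var taken = new int[counts.Count];
    for (int i = 0; i < counts.Count; i++)
    {
        taken[i] = (int)Math.Min((long)counts[i], lo);
        // The partial row is filled in list order by lists that still have an item.
        if (counts[i] > lo && rest > 0) { taken[i]++; rest--; }
    }
    return taken;
}

// Hypothetical helper: (start index, item count) to read from each list for one page.
static (int Start, int Count)[] GetPageRanges(IReadOnlyList<int> counts, int pageSize, int pageIndex)
{
    long first = (long)pageIndex * pageSize;   // assumes a zero-based page index
    var before = TakenPerList(counts, first);
    var through = TakenPerList(counts, first + pageSize);
    return before.Zip(through, (b, t) => (b, t - b)).ToArray();
}

// Lengths from the question (10, 5, 13), page size 6, third page (pageIndex 2).
var ranges = GetPageRanges(new[] { 10, 5, 13 }, 6, 2);
Console.WriteLine(string.Join(", ", ranges));   // (4, 3), (4, 1), (4, 2)
```

For page 3 of the question's data this yields three items from list 1 starting at index 4, one from list 2 and two from list 3, which corresponds to 05, 06, 07 / 15 / 20, 21. The ranges say what to fetch from each list; putting the fetched items into page order still requires interleaving them.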

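Since we are going for performance here, the code is not necessarily as self-describing as I would like, so I have added some comments to help explain the logic: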

    static IEnumerable<T> GetPageItems<T>(List<List<T>> itemLists, int pageSize, int page)
    {
        if (page < 1)
        {
            return new List<T>();
        }

        // a simple copy so that we don't change the original (the individual Lists inside are untouched):
        var lists = itemLists.ToList();

        // Let's find the starting indexes for the first item on this page:
        var currItemIndex = 0;
        var currListIndex = 0;
        var itemsToSkipCount = pageSize * (page - 1); // <-- assuming that the first page is page 1, not page 0

        // I'll just break out of this loop manually, because I think this configuration actually makes
        // the logic below a little easier to understand.  Feel free to change it however you see fit :)
        while (true)
        {
            var listsCount = lists.Count;
            if (listsCount == 0)
            {
                return new List<T>();
            }

            // Let's consider a horizontal section of items taken evenly from all lists (based on the length of
            // the shortest list).  We don't need to iterate any items in the lists;  Rather, we'll just count 
            // the total number of items we could get from this horizontal portion, and set our indexes accordingly...
            var shortestListCount = lists.Min(x => x.Count);
            var itemsWeAreConsideringCount = listsCount * (shortestListCount - currItemIndex);

            // Does this horizontal section contain at least as many items as we must skip?

            if (itemsWeAreConsideringCount >= itemsToSkipCount) 
            {   // Yes: So mathematically find the indexes of the first page item, and we're done.
                currItemIndex += itemsToSkipCount / listsCount;
                currListIndex = itemsToSkipCount % listsCount;
                break; 
            }
            else
            {   // No: So we need to keep going.  Let's increase currItemIndex to the end of this horizontal 
                // section, remove the shortest list(s), and the loop will continue with the remaining lists:
                currItemIndex = shortestListCount;
                lists.RemoveAll(x => x.Count == shortestListCount);
                itemsToSkipCount -= itemsWeAreConsideringCount;
            }
        }

        // Ok, we've got our starting indexes, and the remaining lists that still have items in the index range.
        // Let's get our items from those lists:
        var pageItems = new List<T>();
        var largestListCount = lists.Max(x => x.Count);

        // Loop until we have enough items to fill the page, or we run out of items:
        while (pageItems.Count < pageSize && currItemIndex < largestListCount)
        {
            // Taking from one list at a time:
            var currList = lists[currListIndex];

            // If the list has an element at this index, get it:
            if (currItemIndex < currList.Count)
            {
                pageItems.Add(currList[currItemIndex]);                    
            }
            // else... this list has no more elements.
            // We could throw away this list, since it's pointless to iterate over it any more, but that might 
            // change the indices of other lists...  for simplicity, I'm just gonna let it be... since the above 
            // logic simply ignores an empty list.

            currListIndex++;
            if (currListIndex == lists.Count)
            {
                currListIndex = 0;
                currItemIndex++;
            }
        }

        return pageItems;
    }

Here is some test code using three lists. I can grab 6 items from page 1,000,000 in just a few milliseconds :)

        var list1 = Enumerable.Range(0, 10000000).ToList();
        var list2 = Enumerable.Range(10000000, 10000000).ToList();
        var list3 = Enumerable.Range(20000000, 10000000).ToList();
        var lists = new List<List<int>> { list1, list2, list3 };

        // Stopwatch lives in System.Diagnostics.
        var timer = Stopwatch.StartNew();

        var items = GetPageItems(lists, 6, 1000000).ToList();

        timer.Stop();
        Console.WriteLine("{0} items in {1} ms", items.Count, timer.ElapsedMilliseconds);