3SUM - O(n^2 * log n) slower than O(n^2)?

In the scenario I present here, my solution is supposed to be O(n^2 * log n), while the "pointers" solution, which I assume is the fastest way to solve the 3SUM problem, is O(n^2 * 1). That leaves the question of whether O(1) is faster than O(log n), using my code as the example.
Could someone explain why this seems to be the case, please? My logic tells me that O(log n) should be as fast as O(1), if not faster.
I hope the comments in my solution's code are clear.

Edit: I know this does not sound very smart: log(n) grows with the input (n -> ∞), while 1 is just 1. BUT, in this case, to find a number, how can doing sums and subtractions be faster than using binary search (log n)? It just does not enter my head.


LeetCode 3SUM problem description


O(n^2 * log n)

For an input of 3,000 values:

  • Iterations: 1,722,085 (61% fewer than the "pointers" solution)
  • Runtime: ~92 ms (about 2.7 times the runtime of the typical O(n^2) solution)
public IList<IList<int>> MySolution(int[] nums)
{
    IList<IList<int>> triplets = new List<IList<int>>();

    Array.Sort(nums);

    for (int i = 0; i < nums.Length; i++)
    {
        // Avoid duplicating results.
        if (i > 0 && nums[i] == nums[i - 1])
            continue;

        for (int j = i+1; j < nums.Length - 1; j++)
        {
            // Avoid duplicating results.
            if (j > (i+1) && nums[j] == nums[j - 1])
                continue;

            // The solution for this triplet.
            int numK = -(nums[i] + nums[j]);

            // * This is the problem.
            // Search for 'k' index in the array.
            int kSearch = Array.BinarySearch(nums, j + 1, nums.Length - (j + 1), numK);

            // 'numK' exists in the array (a non-negative result is the index where it was found).
            if (kSearch >= 0)
            {
                triplets.Add(new List<int>() { nums[i], nums[j], numK });
            }
            // 'numK' is too small, break this loop since its value is just going to increase.
            else if (~kSearch == (j + 1))
            {
                break;
            }
        }
    }

    return triplets;
}

O(n^2)

For the same input of 3,000 values:

  • Iterations: 4,458,579
  • Runtime: ~34 ms
public IList<IList<int>> PointersSolution(int[] nums)
{
    IList<IList<int>> triplets = new List<IList<int>>();

    Array.Sort(nums);

    for (int i = 0; i < nums.Length; i++)
    {
        if (i > 0 && nums[i] == nums[i - 1])
            continue;

        int l = i + 1, r = nums.Length - 1;

        while (l < r)
        {
            int sum = nums[i] + nums[l] + nums[r];

            if (sum < 0)
            {
                l++;
            }
            else if (sum > 0)
            {
                r--;
            }
            else
            {
                triplets.Add(new List<int>() { nums[i], nums[l], nums[r] });

                do
                {
                    l++;
                }
                while (l < r && nums[l] == nums[l - 1]);
            }
        }
    }

    return triplets;
}

It seems your conceptual misunderstanding comes from overlooking that Array.BinarySearch performs iterations too (this was visible in the initial iteration counts in the question, which you have since changed).

So while the assumption that binary search should be faster than a simple iteration through the collection is perfectly valid, you are missing that binary search is basically an extra loop. You should therefore not compare those two directly; instead, compare the second for loop plus the binary search in the first solution against the inner loop of the second solution.
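To make the "extra loop" visible, here is a minimal sketch of a hand-written binary search that counts its own loop iterations. The class and method names (BinarySearchSteps, Search) are made up for illustration; this is not the code Array.BinarySearch actually runs, only the same idea with a counter added. For 3,000 elements a single search needs about log2(3000) ≈ 11.5 probes, so the 1,722,085 outer iterations reported in the question could hide up to roughly 20 million probes in the worst case (a rough estimate, since each search only covers the part of the array after j).

```csharp
using System;

public static class BinarySearchSteps
{
    // Binary search over nums[start .. nums.Length - 1] that also reports
    // how many loop iterations (probes) it needed.
    public static int Search(int[] nums, int start, int target, out int steps)
    {
        steps = 0;
        int lo = start, hi = nums.Length - 1;
        while (lo <= hi)
        {
            steps++; // one probe per loop iteration
            int mid = lo + (hi - lo) / 2;
            if (nums[mid] == target) return mid;
            if (nums[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return ~lo; // not found: bitwise complement of the insertion point
    }

    public static void Main()
    {
        // Sorted array of even numbers: 0, 2, 4, ...
        int[] nums = new int[3000];
        for (int i = 0; i < nums.Length; i++) nums[i] = i * 2;

        // An odd target is never present, so the search runs to the end.
        Search(nums, 0, 2999, out int steps);
        Console.WriteLine($"n = {nums.Length}, probes for one failed search = {steps}");
    }
}
```

Multiplying the outer iteration count by the probes per search gives a fairer number to hold against the iteration count of the pointers solution.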

P.S.

To argue about time complexity from runtimes with any degree of certainty, you need at least to run several tests with increasing numbers of elements (like 100, 1000, 10000, 100000...) and see how the runtime changes. Using different inputs for the same number of elements is also recommended, because in theory you can hit an optimal case for one algorithm that is a worst-case scenario for another.
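A rough sketch of such a test, assuming a plain Stopwatch is accurate enough for a first look. GrowthCheck, TimeMs and CountUniqueTriplets are hypothetical names; CountUniqueTriplets is only a stand-in workload (a two-pointer count), and either solution from the question could be plugged in instead. The input sizes and value range are arbitrary choices.

```csharp
using System;
using System.Diagnostics;

public static class GrowthCheck
{
    // Stand-in workload: counts unique zero-sum triplets with two pointers.
    public static int CountUniqueTriplets(int[] nums)
    {
        Array.Sort(nums);
        int count = 0;
        for (int i = 0; i < nums.Length; i++)
        {
            if (i > 0 && nums[i] == nums[i - 1]) continue;
            int l = i + 1, r = nums.Length - 1;
            while (l < r)
            {
                int sum = nums[i] + nums[l] + nums[r];
                if (sum < 0) l++;
                else if (sum > 0) r--;
                else
                {
                    count++;
                    do { l++; } while (l < r && nums[l] == nums[l - 1]);
                }
            }
        }
        return count;
    }

    // Times one call of 'work' on a fresh random array of the given size.
    public static double TimeMs(Func<int[], int> work, int size, int seed)
    {
        var rng = new Random(seed);
        int[] data = new int[size];
        for (int i = 0; i < size; i++) data[i] = rng.Next(-100000, 100001);

        var sw = Stopwatch.StartNew();
        work(data);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        double previous = 0;
        foreach (int n in new[] { 500, 1000, 2000, 4000 })
        {
            // Several different inputs per size; keep the best time to reduce noise.
            double best = double.MaxValue;
            for (int seed = 1; seed <= 5; seed++)
                best = Math.Min(best, TimeMs(CountUniqueTriplets, n, seed));

            string ratio = previous > 0 ? (best / previous).ToString("F2") : "-";
            Console.WriteLine($"n = {n,5}: {best,8:F3} ms (x{ratio} vs previous size)");
            previous = best;
        }
    }
}
```

When n doubles, an O(n^2) algorithm should take roughly four times as long, and an O(n^2 * log n) one slightly more than that. The ratio column is what to watch, not the absolute times.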

Quick interjection: I'm not sure your second solution (pointers) is O(n^2), since it has a third inner loop. (See Stron's response below.)
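One way to check this empirically is to count every pointer move, including those made by the innermost do/while. The sketch below is the pointers solution from the question with the result list replaced by a counter; PointerMoves and CountMoves are hypothetical names. The reasoning it relies on: each l++ or r-- shrinks the gap r - l by one, and the gap starts at n - i - 2 for a given i, so the moves per i cannot exceed that regardless of how the loops are nested.

```csharp
using System;

public static class PointerMoves
{
    // Same control flow as the pointers solution, but returns the total
    // number of l++ / r-- steps instead of the triplets.
    public static long CountMoves(int[] nums)
    {
        Array.Sort(nums);
        long moves = 0;
        for (int i = 0; i < nums.Length; i++)
        {
            if (i > 0 && nums[i] == nums[i - 1]) continue;
            int l = i + 1, r = nums.Length - 1;
            while (l < r)
            {
                int sum = nums[i] + nums[l] + nums[r];
                if (sum < 0) { l++; moves++; }
                else if (sum > 0) { r--; moves++; }
                else
                {
                    // The "third loop": it only ever advances l, never resets it.
                    do { l++; moves++; } while (l < r && nums[l] == nums[l - 1]);
                }
            }
        }
        return moves;
    }

    public static void Main()
    {
        int n = 3000;
        var rng = new Random(42);
        int[] nums = new int[n];
        for (int i = 0; i < n; i++) nums[i] = rng.Next(-1000, 1001);

        long moves = CountMoves(nums);
        long bound = (long)(n - 1) * (n - 2) / 2; // sum of (n - i - 2) over all i
        Console.WriteLine($"moves = {moves}, upper bound = {bound}");
    }
}
```

If the count never exceeds (n - 1)(n - 2) / 2, the nested loops together still do O(n^2) work, which is the usual amortized argument for two-pointer scans.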

I took a moment to profile your code with a generic .NET profiler and:

[image]

That ought to do it, huh? ;)

After checking the implementation, I found that BinarySearch internally uses CompareTo, which I imagine isn't ideal (but, being a generic for an unmanaged type, it shouldn't be that bad...).

To "improve" it, I dragged BinarySearch out, kicking and screaming, and replaced the CompareTo with actual comparison operators. I named this benchmark MyImproved. Here are the results:

[image: flame graph]

BenchmarkDotNet results:

Interestingly, BenchmarkDotNet disregards common sense and puts MyImproved ahead of Pointers. This may be due to some optimization that is turned off by the profiler.

Method      Complexity       Mean      Error     StdDev    Code Size
Pointers    O(n^2)???        76.76 ms  1.465 ms  1.628 ms  1,781 B
My          O(n^2 * log n)   93.08 ms  1.831 ms  3.980 ms  1,999 B
MyImproved  O(n^2 * log n)   62.53 ms  1.234 ms  2.226 ms  1,980 B

TL;DR:

.CompareTo() seemed to be bogging down the implementation of .BinarySearch(). Removing it and using actual integer comparison seemed to help a lot. Either that, or it's some funky interface stuff that I'm not prepared to investigate :)
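For reference, a binary search specialised to int that uses plain comparison operators might look like the sketch below. This is an assumption about the general shape of such a change, not the MyImproved code that was benchmarked (that code is not shown here); IntSearch and BinarySearchInt are made-up names. It keeps the Array.BinarySearch convention of returning the bitwise complement of the insertion point on a miss, so the `~kSearch == (j + 1)` check in the question would keep working.

```csharp
using System;

public static class IntSearch
{
    // Searches array[index .. index + length - 1] (which must be sorted)
    // using direct int comparisons instead of CompareTo.
    public static int BinarySearchInt(int[] array, int index, int length, int value)
    {
        int lo = index;
        int hi = index + length - 1;
        while (lo <= hi)
        {
            int mid = lo + ((hi - lo) >> 1); // avoids overflow of (lo + hi)
            int current = array[mid];
            if (current == value) return mid;
            if (current < value) lo = mid + 1;
            else hi = mid - 1;
        }
        return ~lo; // negative: complement of where 'value' would be inserted
    }

    public static void Main()
    {
        int[] nums = { -4, -1, 0, 1, 2, 7 };
        int found = BinarySearchInt(nums, 1, nums.Length - 1, 2);
        int missing = BinarySearchInt(nums, 1, nums.Length - 1, 5);
        Console.WriteLine($"found at {found}, missing -> insertion point {~missing}");
    }
}
```

In the question's code it would be called as `IntSearch.BinarySearchInt(nums, j + 1, nums.Length - (j + 1), numK)` in place of Array.BinarySearch. Whether that actually helps on a given runtime is something to measure rather than assume.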

Two tips:

  1. Use sharplab.io to see your lowered code; it may reveal something (link).

  2. Try running these separate tests through the BenchmarkDotNet NuGet package. It'll give you more accurate timings, and if the memory usage or allocations are considerably higher in one case, that could be your answer.

Anyway, are you running these tests in debug or release mode? I just had a thought, which I haven't tested recently, but I believe the debugger overhead can significantly affect the performance of a binary search.

Give it a go, and let me know.
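If it is unclear which mode a test run is actually using, a small check like the following can be printed before timing anything. It is only a sketch: BuildCheck and IsJitOptimizerDisabled are hypothetical names, and it assumes that a debug build marks the assembly with a DebuggableAttribute whose IsJITOptimizerDisabled flag is set, which is the usual compiler behaviour but worth confirming for your own toolchain.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class BuildCheck
{
    // True when the assembly was compiled with JIT optimizations disabled,
    // which is typically the case for debug builds.
    public static bool IsJitOptimizerDisabled(Assembly assembly)
    {
        var attribute = assembly.GetCustomAttribute<DebuggableAttribute>();
        return attribute != null && attribute.IsJITOptimizerDisabled;
    }

    public static void Main()
    {
        Assembly assembly = typeof(BuildCheck).Assembly;
        Console.WriteLine($"Debugger attached:          {Debugger.IsAttached}");
        Console.WriteLine($"JIT optimizations disabled: {IsJitOptimizerDisabled(assembly)}");
    }
}
```

If either line prints True, the timings are probably not representative of an optimized release build.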
