
Using LINQ to generate prime numbers

The following is an interview question:

The following one-liner generates and displays a list of the first 500 prime numbers. How would you optimize it using Parallel LINQ (PLINQ) while still keeping it a SINGLE C# STATEMENT?

MessageBox.Show(string.Join(",", 
    Enumerable.Range(2, (int)(500 * (Math.Log(500) + Math.Log(System.Math.Log(500)) - 0.5)))
                .Where(x => Enumerable.Range(2, x - 2)
                                      .All(y => x % y != 0))
                .TakeWhile((n, index) => index < 500)));

I tried introducing AsParallel() as well as ParallelEnumerable into the query, but did not see any tangible benefit on multi-core machines. The query still uses one CPU core heavily while the other cores sit idle. Can someone suggest an improvement that would distribute the load equally across all cores and thus reduce execution time?

For the enthusiast: the following formula returns an upper bound that is guaranteed to be greater than the N-th prime number, i.e. if you check every number up to this bound, you are sure to find at least N primes below it:

UpperBound = N * (Log(N) + Log(Log(N)) - 0.5) //Log is natural log
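As a rough sketch of how the formula behaves, the snippet below evaluates it for N = 500, the value used in the question. The class and method names (PrimeBound, UpperBound) are made up for illustration. One caveat, as far as I know: the bound is normally quoted only for N >= 20, and for small N it can undershoot (for N = 6 it gives 11.25, while the 6th prime is 13), so the sketch rejects smaller inputs.

```csharp
using System;

public static class PrimeBound
{
    // Upper bound for the N-th prime: N * (ln N + ln ln N - 0.5).
    // Assumption: N >= 20; for smaller N the formula can fall below the N-th prime.
    public static int UpperBound(int n)
    {
        if (n < 20)
            throw new ArgumentOutOfRangeException(nameof(n), "This sketch assumes n >= 20.");
        return (int)(n * (Math.Log(n) + Math.Log(Math.Log(n)) - 0.5));
    }

    public static void Main()
    {
        // The 500th prime is 3571, so the bound leaves some slack above it.
        Console.WriteLine(UpperBound(500)); // prints 3770
    }
}
```

The slack matters for the queries in this thread: the searched range contains somewhat more than N primes, so the query still has to cut the result down to exactly N.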

This does it nicely on my machine. I've never actually seen all my cores go to 100% until now. Thanks for giving me an excuse to play :)

I increased the count until the run was slow enough to measure (20,000 primes).

The key option that made the difference for me was setting the ExecutionMode to ForceParallelism.

Because I use the NotBuffered merge option, I re-sort the results when I'm done. This would not be necessary if you don't care about the order of the results (perhaps you're putting them in a HashSet).

WithDegreeOfParallelism and WithMergeOptions provided only minor gains (if any) on my machine. This example shows how to use all the options in a single LINQ statement, which was the original question.

var numbers = Enumerable.Range(2, (int)(20000 * (Math.Log(20000) + Math.Log(Math.Log(20000)) - 0.5)))
                .AsParallel()
                .WithDegreeOfParallelism(Environment.ProcessorCount)
                .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
                .WithMergeOptions(ParallelMergeOptions.NotBuffered) // yield results as soon as they are produced
                .Where(x => Enumerable.Range(2, x - 2)
                                      .All(y => x % y != 0));
// The parallel query is unordered, so sort before taking the first 20,000;
// a TakeWhile on the unordered query is not guaranteed to keep the smallest primes.
string result = String.Join(",", numbers.OrderBy(n => n).Take(20000));
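One way to convince yourself that the parallel version still returns the right primes is to compare it with the plain sequential query. The sketch below does that for a smaller count so it finishes quickly. The class and method names are hypothetical, and it assumes the documented PLINQ behavior that AsParallel() does not preserve source order unless asked to, which is why it sorts before calling Take.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelPrimesCheck
{
    // Brute-force test from the query above: x has no divisor in [2, x - 1].
    static bool IsPrime(int x) => Enumerable.Range(2, x - 2).All(y => x % y != 0);

    // Sequential reference: the source is ordered, so Take(n) keeps the n smallest primes.
    public static List<int> SequentialPrimes(int n, int bound) =>
        Enumerable.Range(2, bound).Where(x => IsPrime(x)).Take(n).ToList();

    public static List<int> ParallelPrimes(int n, int bound) =>
        Enumerable.Range(2, bound)
            .AsParallel()
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .Where(x => IsPrime(x))
            .OrderBy(p => p)   // restore ascending order first ...
            .Take(n)           // ... so that Take keeps the n smallest primes
            .ToList();

    public static void Main()
    {
        const int n = 2000;
        int bound = (int)(n * (Math.Log(n) + Math.Log(Math.Log(n)) - 0.5));
        Console.WriteLine(SequentialPrimes(n, bound).SequenceEqual(ParallelPrimes(n, bound))); // True
    }
}
```

AsOrdered() right after AsParallel() would be another way to keep the source order, at some cost in throughput; I have not measured which is faster here.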

You only need to test divisors up to the square root of the value (an improved version of the code above):

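// i is the number of primes wanted (N in the formula above).
var numbers = new[] {2, 3}.Union(Enumerable.Range(2, (int) (i*(Math.Log(i) + Math.Log(Math.Log(i)) - 0.5)))
                                           .AsParallel()
                                           .WithDegreeOfParallelism(Environment.ProcessorCount)
                                           // 8 cores on my machine
                                           .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
                                           .WithMergeOptions(ParallelMergeOptions.NotBuffered)
                                           // yield results as soon as they are produced
                                           // The divisor range 2..ceil(sqrt(x))+1 contains x itself for x = 2 and 3,
                                           // so those two primes are rejected here and added back by the Union.
                                           .Where(x => Enumerable.Range(2, (int) Math.Ceiling(Math.Sqrt(x)))
                                                                 .All(y => x%y != 0)))
                          .OrderBy(n => n) // the parallel query is unordered: sort, then keep the first i
                          .Take(i)
                          .ToList();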
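The reason the square root is enough: if x = a * b with a <= b, then a * a <= x, so any composite x has a divisor no larger than sqrt(x). A minimal sketch of that test as a standalone helper follows. The names are hypothetical, and it uses a divisor range of 2..floor(sqrt(x)) rather than the ceiling used above, which avoids having to special-case 2 and 3.

```csharp
using System;
using System.Linq;

public static class TrialDivision
{
    // True if x >= 2 and no d in [2, floor(sqrt(x))] divides x.
    // For x = 2 and x = 3 the divisor range is empty, so All(...) returns true.
    public static bool IsPrime(int x) =>
        x >= 2 && Enumerable.Range(2, (int)Math.Sqrt(x) - 1).All(d => x % d != 0);

    public static void Main()
    {
        // Prints: 2,3,5,7,11,13,17,19,23,29
        Console.WriteLine(string.Join(",", Enumerable.Range(0, 30).Where(x => IsPrime(x))));
    }
}
```

For a prime near the top of the 20,000-prime range (around 230,000) this checks roughly 480 candidate divisors instead of about 230,000, which is where most of the speedup comes from.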

But that is overkill when a simple and extremely fast algorithm exists:

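private static IEnumerable<int> GetPrimes(int k)
{
    // Upper bound for the k-th prime (assumes k is large enough for the formula to hold).
    int n = (int)Math.Ceiling((k * (Math.Log(k) + Math.Log(Math.Log(k)) - 0.5)));
    bool[] prime = new bool[n + 1];
    prime[0] = prime[1] = false;
    for (int i = 2; i < prime.Length; i++)
    {
        prime[i] = true;
    }
    // Sieve of Eratosthenes: cross out the multiples of every prime up to sqrt(n).
    for (int i = 2; i*i <= n; ++i) // valid for n < 46340^2 = 2147395600
        if (prime[i])
        {
            for (int j = i*i; j <= n; j += i)
                prime[j] = false;
        }
    // Yield the survivors in ascending order and stop after k primes.
    // (Yielding inside the sieve loop would only return primes up to sqrt(n).)
    for (int i = 2, found = 0; i <= n && found < k; ++i)
        if (prime[i])
        {
            ++found;
            yield return i;
        }
}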

Of course, it isn't as good as LINQ, because it is not a fashionable way to solve the problem, but you should know that it exists.

Stopwatch t = new Stopwatch();
t.Start();
var numbers = Enumerable.Range(2, (int)(500 * (Math.Log(500) + Math.Log(Math.Log(500)) - 0.5)))
    .Where(x => Enumerable.Range(2, x - 2)
                          .All(y => x % y != 0))
    .TakeWhile((n, index) => index < 500)
    .ToList(); // force evaluation here; a deferred query would only run inside string.Join, after t.Stop()
t.Stop();
MessageBox.Show(t.ElapsedMilliseconds.ToString());
MessageBox.Show(string.Join(",", numbers));

It evaluates in 3 milliseconds. Good LINQ query.
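A caveat when timing LINQ: queries are evaluated lazily, so building a query does no work until something enumerates it. The sketch below makes that visible with a counter and then times the enumeration itself. The class name and the counter are illustrative only, and the range size 3770 is simply the value the bound formula gives for N = 500.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class DeferredTiming
{
    public static int Evaluations; // how many candidates the predicate has tested so far

    public static void Main()
    {
        // Building the query runs nothing; it only describes the work.
        var query = Enumerable.Range(2, 3770)
            .Where(x =>
            {
                Evaluations++;
                return Enumerable.Range(2, x - 2).All(y => x % y != 0);
            })
            .Take(500);
        Console.WriteLine(Evaluations); // 0: nothing has been evaluated yet

        var sw = Stopwatch.StartNew();
        var primes = query.ToList(); // ToList enumerates the query inside the timed region
        sw.Stop();

        Console.WriteLine(primes.Count); // 500
        Console.WriteLine(sw.ElapsedMilliseconds + " ms");
    }
}
```

Without the ToList() call (or some other enumeration) between starting and stopping the stopwatch, the measured time covers only the construction of the query object.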

For future readers, this is what I ended up with. It is pretty fast. On my humble machine, it generates the list of the first 20,000 prime numbers in under a second.

Enumerable.Range(5, (int)(N * (Math.Log(N) + Math.Log(Math.Log(N)) - 0.5)))
            .AsParallel()
            .WithDegreeOfParallelism(Environment.ProcessorCount)
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .WithMergeOptions(ParallelMergeOptions.NotBuffered) // yield results as soon as they are produced
            .Where(x => Enumerable.Range(2, (int)Math.Ceiling(Math.Sqrt(x)))
                                  .All(y => x % y != 0))
            // No TakeWhile here: on an unordered parallel query it is not guaranteed to keep the smallest primes.
            .Concat(new int[] { 2, 3 }.AsParallel())
            .OrderBy(x => x)
            .Take(N);
