
C#: Is there anything faster than PLINQ for averaging a 10 million item array while grouping?

I'm trying to find the fastest available way to compute per-group averages over a collection of 10 million items. Below is the code I'm using as a baseline, but I can't find a way to make it any faster. I'm evaluating this based on the Stopwatch mainSW.

Here is my Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace SpeedTest
{
    class Program
    {
        static void Main(string[] args)
        {
            global.BuildItems();
            System.Diagnostics.Stopwatch mainSW = new System.Diagnostics.Stopwatch();
            mainSW.Start();
            baseLineTest.plinqGroup();
            Console.WriteLine("Press Return");
            Console.ForegroundColor = ConsoleColor.Green;
            Console.WriteLine("Main SW Elapsed:" + mainSW.Elapsed);
            Console.ReadLine();
        }
    }
}

Here is my global.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace SpeedTest
{
    class global
    {
        //10 million
        public static long globalIteration = 10000000;

        private static string[] classOptions = { "AS", "CS", "LS", "PE", "WP", "LS" };
        public static Items[] items { get; set; }

        public static void BuildItems()
        {
            System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();

            Random r = new Random();

            Console.WriteLine("Building list");
            items = new Items[globalIteration];
            sw.Start();
            for (int i = 0; i < globalIteration; i++)
            {
                items[i] = new Items();
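                // Next's upper bound is exclusive, so only indices 0..4 are picked;
                // the duplicate "LS" at index 5 is never used.
                items[i].cl = classOptions[r.Next(0, 5)];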
                items[i].uc = Convert.ToDecimal(r.Next(300, 1000));
            }
            Console.WriteLine("Building list sw: " + sw.Elapsed);
        }

    }
    class Items
    {
        public decimal uc;
        public string cl;

    }
}

Here is my baseLineTest.cs

using System;
using System.Linq;
using System.Threading.Tasks;

namespace SpeedTest
{
    class baseLineTest
    {

        public static void plinqGroup()
        {

            System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
            Console.WriteLine("This test is a plinq over an array");

            sw.Start();
            // The query is lazy: grouping and averaging run when the foreach below enumerates it.
            var list = global.items.AsParallel()
                .GroupBy(d => d.cl)
                .Select(g => new
                {
                    Key = g.Key,
                    Value = g.Average(s => s.uc)
                });

            foreach (var item in list)
                Console.WriteLine(string.Format("{0} : {1} ", item.Key, item.Value));

            Console.WriteLine("PLinq Group elapsed : " + sw.Elapsed);


        }

    }


}

This depends on how much you know about your data. If the "classOptions" are known to the code that consumes your data, then it is possible to make it around twice as fast (at least on my 4-core machine) by first creating a collection of result objects, one per class option, and then using Parallel.ForEach to update them:

    [TestMethod]
    public void MeasureParallelForEach()
    {
        // The set of possible keys is assumed to be known up front.
        string[] classOptions = { "AS", "CS", "LS", "PE", "WP", "LS" };
        global.BuildItems();
        System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
        Console.WriteLine("This test is a Parallel.ForEach over an array");

        sw.Start();
        // One accumulator per distinct key. The dictionary itself is only read inside the loop.
        // (global.classOptions is private, so the local copy is used here.)
        IDictionary<string, Group> groups = classOptions.Distinct().ToDictionary(x => x, x => new Group(x));
        Parallel.ForEach(global.items, d => groups[d.cl].Add(d.uc));
        foreach (var item in groups)
            Console.WriteLine(string.Format("{0} : {1} ", item.Key, item.Value.Average));
        Console.WriteLine("Parallel.ForEach elapsed : " + sw.Elapsed);
    }

    public class Group
    {
        private readonly object sync = new object();
        private int count;
        private decimal sum;
        public Group(string key)
        {
            Key = key;
        }

        public void Add(decimal d)
        {
            // Add is called from several threads at once; without the lock,
            // concurrent updates to sum and count can be lost.
            lock (sync)
            {
                sum += d;
                count++;
            }
        }

        public string Key { get; private set; }
        public decimal Average { get { return count==0?0m:sum/count; }}
    }
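
Because Group.Add is invoked from several threads concurrently, the shared accumulators need some form of synchronization. One alternative, sketched here and not taken from the original answer, is to give each worker its own private totals and merge them once at the end, using the Parallel.For overload that takes localInit and localFinally delegates. The ThreadLocalAverage class and its AverageByKey method are hypothetical names introduced for illustration, the method takes parallel lists of keys and values to stay self-contained, and the tuple syntax assumes C# 7 or later. No timings are claimed for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper, not part of the original question or answer.
static class ThreadLocalAverage
{
    public static Dictionary<string, decimal> AverageByKey(
        IReadOnlyList<string> keys, IReadOnlyList<decimal> values)
    {
        var totals = new Dictionary<string, (decimal Sum, long Count)>();
        var gate = new object();

        Parallel.For(
            0, keys.Count,
            // localInit: each worker starts with its own empty dictionary.
            () => new Dictionary<string, (decimal Sum, long Count)>(),
            // body: touches only the worker's private dictionary, so no lock is needed here.
            (i, state, local) =>
            {
                local.TryGetValue(keys[i], out var acc); // acc is (0, 0) if the key is new
                local[keys[i]] = (acc.Sum + values[i], acc.Count + 1);
                return local;
            },
            // localFinally: merge one worker's partial totals into the shared result under a lock.
            local =>
            {
                lock (gate)
                {
                    foreach (var pair in local)
                    {
                        totals.TryGetValue(pair.Key, out var acc);
                        totals[pair.Key] = (acc.Sum + pair.Value.Sum,
                                            acc.Count + pair.Value.Count);
                    }
                }
            });

        // Every key present in totals has Count >= 1, so the division is safe.
        return totals.ToDictionary(p => p.Key, p => p.Value.Sum / p.Value.Count);
    }
}
```

With the question's data, the keys would be each item's cl and the values each item's uc. The lock is taken once per worker instead of once per item, which is the point of the design. Whether that is actually faster than the PLINQ baseline would have to be measured on the target machine.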
