简体   繁体   中英

C++ array vs C# ptr speed confusion

I am rewriting a high performance C++ application to C#. The C# app is noticeably slower than the C++ original. Profiling tells me that the C# app spends most time in accessing array elements. Hence I create a simple array access benchmark. I get completely different results than others doing a similiar comparison .

The C++ code:

#include <limits>
#include <stdio.h>
#include <chrono>
#include <iostream>

using namespace std;
using namespace std::chrono;

int main(void)
{
    high_resolution_clock::time_point t1 = high_resolution_clock::now();

    int xRepLen = 100 * 1000;
    int xRepCount = 1000;

    unsigned short * xArray = new unsigned short[xRepLen];
    for (int xIdx = 0; xIdx < xRepLen; xIdx++)
        xArray[xIdx] = xIdx % USHRT_MAX;

    int * xResults = new int[xRepLen];

    for (int xRepIdx = 0; xRepIdx < xRepCount; xRepIdx++)
    {

        // in each repetition, find the first value, that surpasses xArray[xIdx] + 25 - i.e. we will perform 25 searches
        for (int xIdx = 0; xIdx < xRepLen; xIdx++)
        {
            unsigned short xValToBreach = (xArray[xIdx] + 25) % USHRT_MAX;
            xResults[xIdx] = 0;

            for (int xIdx2 = xIdx + 1; xIdx2 < xRepLen; xIdx2++)
            if (xArray[xIdx2] >= xValToBreach)
            {
                xResults[xIdx] = xIdx2; break;
            }

            if (xResults[xIdx] == 0)
                xResults[xIdx] = INT_MAX;
        }
    }

    high_resolution_clock::time_point t2 = high_resolution_clock::now();
    auto duration = duration_cast<milliseconds>(t2 - t1).count();
    cout << "Elasped miliseconds " << duration;
    getchar();
}

The C# code:

using System;
using System.Collections.Generic;
using System.Diagnostics;

namespace arrayBenchmarkCs
{
    class Program
    {
        public static void benchCs()
        {
            unsafe
            {
                int xRepLen = 100 * 1000;
                int xRepCount = 1000;

                ushort[] xArr = new ushort[xRepLen];
                for (int xIdx = 0; xIdx < xRepLen; xIdx++)
                    xArr[xIdx] = (ushort)(xIdx % 0xffff);

                int[] xResults = new int[xRepLen];

                Stopwatch xSw = new Stopwatch(); xSw.Start();
                fixed (ushort * xArrayStart = & xArr [0])
                {
                    for (int xRepIdx = 0; xRepIdx < xRepCount; xRepIdx++)
                    {

                        // in each repetition, go find the first value, that surpasses xArray[xIdx] + 25 - i.e. we will perform 25 searches
                        ushort * xArrayEnd = xArrayStart + xRepLen;
                        for (ushort* xPtr = xArrayStart; xPtr != xArrayEnd; xPtr++)
                        {
                            ushort xValToBreach = (ushort)((*xPtr + 25) % 0xffff);
                            int xResult = -1;
                            for (ushort * xPtr2 = xPtr + 1; xPtr2 != xArrayEnd; xPtr2++)
                                if ( *xPtr2  >= xValToBreach)
                                {
                                    xResult = (int)(xPtr2 - xArrayStart);
                                    break;
                                }

                            if (xResult == -1)
                                xResult = int.MaxValue;

                            // save result
                            xResults[xPtr - xArrayStart] = xResult;
                        }
                    }
                }   // fixed

                xSw.Stop();

                Console.WriteLine("Elapsed miliseconds: " + (xSw.ElapsedMilliseconds.ToString("0"));
            }
        }

        static void Main(string[] args)
        {
            benchCs();
            Console.ReadKey();
        }
    }
}

On my work computer (i7-3770), the C++ version is approx 2x faster than the C# version. On my home computer (i7-5820K) the C++ is 1.5x faster than the C# version. Both are measured in Release. I hoped that by using pointers in C# I would avoid the array boundary checking and the performance would be the same in both languages.

So my questions are the following:

  • home come others are finding C# to be of the same speed as C++?
  • how can I get C# performance to the C++ level if not via pointers?
  • what could be the driver of different speedups on different computers?

Any hint is much appreciated, Daniel

You won't get this kind of hardcore number crunching to C++ speed. Using pointer arithmetic and unsafe code gets you some of the way there (it's almost half as slow again if you remove the unsafe and fixed parts). C# isn't compiled to native code, and the code that it is running is full of extra checks and stuff.

If you're willing to go unsafe then really there's nothing stopping you coding your C++ performance-critical stuff into a mixed-mode assembly, and calling that from your C# glue code.

C++ code is not working the same way as C#. The inner loop is different. There are 4 memory operations xResults[xIdx] and just 1 in c#.

I was shocked, that performance of C# code depends so much on framework version. What's even more interesting C# on .net core 3.1 outperformed C++ by 5%. With other frameworks I checked C# was 30-50% slower then C++

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM