How do I speed up C# numeric code

I have a C# application that calls a Fortran DLL to do numeric calculations. We determined that the DLL call is at least 10x slower than running the same calculation from a Fortran console app, so I have started porting the code to C#. The porting process is to copy the code and fix the syntax line by line, so the C# and Fortran look very similar. Common blocks in the Fortran become fields in the class. After porting a few of the core routines, I started testing and found that the double-precision C# code is 30x slower than the double-precision Fortran (50x slower than the single-precision Fortran). I loop over my test code 100 times to minimize the contribution of the C# JIT compiler's overhead.

The code uses complex arithmetic and complex functions such as SQRT and LOG10. I have supplied a C# struct to handle that math. I suspect this is where the problem lies, but there does not seem to be a profiler in VS2010 Pro, so I don't know for sure.

The Fortran compiler is a recent version from Intel. I have not applied any special optimizations to either the C# or the Fortran code. I compared timings using release builds. I only have a two-core CPU, so parallelization probably won't help much.

I could use some suggestions on how to speed up this code.

Here is one of my methods. The C# looks just like the Fortran, which I did not write.

    public void Forwx(double rshldr, double rbed, double[][] resdep, double[]toolparm)
    {
        var zex = new Complex[70];
        var zey = new Complex[70];

        var px2 = new Complex[70];
        var px4 = new Complex[70];

        var rlt = new double[2];
        var trsmt = new double[2];
        var fr = new double[2];
        var dnc = -0.02;
        var factr = 26332.65;

        var rh2 = Math.Max(0.1, rbed);
        var rh1 = Math.Max(0.1, rshldr);
        const double e1 = 1.0;
        const double e2 = 1.0;
        const double er = 0.818730753077982;
        const double re = 1.0 / er;
        var ii = Complex.I;
        const double pi = Math.PI;
        const double eps0 = 8.854e-12;
        const double amu0 = 4.0e-7 * pi;

        for (var ktool = 3; ktool <= 6; ktool++)
        {
            if (ktool == 3)            // Integrated 2MHz
            {
                dnc = -0.02;
                rlt[0] = 0.2794;
                rlt[1] = -0.2794;
                trsmt[0] = 0.904875;
                trsmt[1] = -0.904875;
                fr[0] = 2000.0;
                factr = 26332.65;
            }

            if (ktool == 4)         // Integrated 400kHz
            {
                dnc = -0.02;
                rlt[0] = 0.2794;
                rlt[1] = -0.2794;
                trsmt[0] = 0.904875;
                trsmt[1] = -0.904875;
                fr[0] = 400.0;
                factr = 26811.866;
            }

            if (ktool == 5)           // Option 5 20kHz
            {
                dnc = -0.1;
                rlt[0] = 0.0;
                rlt[1] = 0.0;
                trsmt[0] = 5.75;
                trsmt[1] = 5.75;
                fr[0] = 20.0;
                factr = 26811.866 * 2.516 * toolparm[1] / 0.28e8;
            }

            if (ktool == 6)         // Option 6 50kHz
            {
                dnc = -0.1;
                rlt[0] = 0.0;
                rlt[1] = 0.0;
                trsmt[0] = 5.75;
                trsmt[1] = 5.75;
                fr[0] = 50.0;
                factr = 26811.866 * 6.291 * toolparm[2] / 0.7e8;
            }

            var r1 = trsmt[0] - rlt[0];
            var r2 = trsmt[0] - rlt[1];
            var omega = 2000.0 * pi * fr[0];
            var k12 = omega*amu0*(omega*e1*eps0 + ii/rh1);
            var k22 = omega*amu0*(omega*e2*eps0 + ii/rh2);
            var krat = (k22 - k12)/k12;

            for (var iz = 0; iz < 601; iz++)
            {
                var recx1 = new Complex(0.0, 0.0);
                var rx1 = new Complex(0.0, 0.0);
                var recy1 = new Complex(0.0, 0.0);
                var ry1 = new Complex(0.0, 0.0);
                var lam = new Complex(3.01517934056e-04 / (Math.Pow(er, 5) * r1));
                Complex c1;
                Complex c2;
                for (var i = 0; i < 70; i++)
                {
                    if (iz == 0)
                    {
                        lam = lam * re;
                        var lam2 = lam * lam;
                        var p11 = lam2 - k12;
                        var p1 = Complex.Sqrt(p11);
                        var p22 = lam2 - k22;
                        var p2 = Complex.Sqrt(p22);
                        zex[i] = Complex.Exp(dnc * p2);
                        zey[i] = Complex.Exp(dnc * p1);
                        c1 = p2 * k12;
                        c2 = p1 * k22;
                        var t3 = lam / p2;
                        var t2 = t3 * (c1 - c2) / (c1 + c2);
                        var q2 = lam * krat * (t2 + t3) / (p1 + p2);
                        px2[i] = (lam2 * q2 + lam * p2 * t2);
                        px4[i] = px2[i];
                    }
                    else
                    {
                        px2[i] = px2[i] * zex[i];
                        px4[i] = px4[i] * zey[i];
                    }
                    recx1 = recx1 + a1[i] * px2[i];
                    recy1 = recy1 + a1[i] * px4[i];
                    rx1 = rx1 + px2[i] * as1i[i];
                    ry1 = ry1 + px4[i] * as1i[i];
                }
                if (ktool <= 4)
                {
                    c1 = recx1*r1;
                    c2 = rx1*r2;
                    c2 = c2 - Math.Pow(r1/r2,3)*c1;
                    resdep[12 - ktool][iz + 600] = c2.Re*factr;
                    c1 = recy1*r1;
                    c2 = ry1*r2;
                    c2 = c2 - Math.Pow(r1 / r2,3) * c1;
                    resdep[12 - ktool][600 - iz] = c2.Re*factr;
                }
                else
                {
                    c1 = recx1*r1;
                    //c2 = rx1*r2;
                    //c2 = c2 - Math.Pow(r1 / r2,3) * c1;
                    resdep[ktool + 5][iz + 600] = c1.Re * factr;
                    c1 = recy1*r1;
                    //c2 = ry1*r2;
                    //c2 = c2 - Math.Pow(r1 / r2,3) * c1;
                    resdep[ktool + 5][600 - iz] = c1.Re * factr;
                }
            }
        }
    }

Here are some of the methods in the complex struct.

    public static Complex Sqrt(double x)
    {
        return x >= 0 ? new Complex(Math.Sqrt(x)) : new Complex(0, Math.Sqrt(-x));
    }

    public static Complex Exp(Complex z)
    {
        return new Complex(Math.Exp(z.Re) * Math.Cos(z.Im), Math.Exp(z.Re) * Math.Sin(z.Im));
    }

    public static Complex Log(Complex z)
    {
        return new Complex(Math.Log(Abs(z)), Arg(z));
    }

Here is part of the complex struct.

    public struct Complex
    {
        private readonly double _re;
        private readonly double _im;

        #region Properties

        public double Re
        {
            get { return _re; }
            //set { re = value; }
        }

        public double Im
        {
            get { return _im; }
            //set { im = value; }
        }

        public static Complex I
        {
            get { return new Complex(0.0, 1.0); }
        }

        public static Complex Zero
        {
            get { return new Complex(0.0, 0.0); }
        }

        public static Complex One
        {
            get { return new Complex(1.0, 0.0); }
        }

        #endregion


        #region constructors

        public Complex(double x)
        {
            _re = x;
            _im = 0.0;
        }

        public Complex(Complex z)
        {
            _re = z.Re;
            _im = z.Im;
        }

        public Complex(double x, double y)  //constructor
        {
            _re = x;
            _im = y;
        }

        #endregion
    }

You should try getting rid of your Complex struct and using the built-in one instead. It's in the System.Numerics namespace. You might have to do a find-and-replace on your code to replace things like Complex.I with Complex.ImaginaryOne, but it should be a fairly trivial conversion.

Two advantages to this:

1) Built-in logic will be better optimized than anything you can write (or at least no worse).
2) It makes for easier maintainability, because it uses the .NET standard, so anyone can go look at the documentation, and anything which augments that will work on your code.
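As a rough sketch of what that conversion might look like, here are two expressions from Forwx rewritten against System.Numerics.Complex (available from .NET 4, with a reference to System.Numerics.dll). The class and method names are hypothetical and only wrap the expressions for illustration; the parameter names follow the question's code.

```csharp
using System;
using System.Numerics;

// Hypothetical helper class; not part of the original code.
public static class BuiltInComplexSketch
{
    // Same form as the k12/k22 lines in Forwx:
    //   omega*amu0*(omega*e*eps0 + ii/rh)
    public static Complex WaveNumberSquared(double omega, double amu0, double e, double eps0, double rh)
    {
        Complex ii = Complex.ImaginaryOne;   // replaces the custom Complex.I
        // double operands are converted to Complex implicitly.
        return omega * amu0 * (omega * e * eps0 + ii / rh);
    }

    // Same form as the zex/zey lines: exp(dnc * sqrt(lam^2 - k2)).
    public static Complex DecayFactor(Complex lam, Complex k2, double dnc)
    {
        Complex p = Complex.Sqrt(lam * lam - k2);
        return Complex.Exp(dnc * p);
    }
}
```

Note that the built-in struct exposes the parts as Real and Imaginary, so uses of .Re and .Im in the question's code would need renaming too.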

The best path I see is to use C++/CLI and AMP to leverage the GPU for heavy computations.

But before you do that, make sure that the performance problem is related to the DLL, and not to data marshalling etc...
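Without a profiler, one simple way to check where the time goes is to time each candidate separately with System.Diagnostics.Stopwatch. The helper below is only a sketch (the class and method names are made up for illustration): it makes one untimed warm-up call so JIT compilation is excluded, then averages over several runs.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper; not part of the original code.
public static class TimingSketch
{
    // Calls 'work' once untimed (warm-up, so JIT cost is excluded),
    // then returns the average milliseconds per call over 'iterations' runs.
    public static double AverageMilliseconds(Action work, int iterations)
    {
        if (work == null) throw new ArgumentNullException("work");
        if (iterations <= 0) throw new ArgumentOutOfRangeException("iterations");

        work(); // warm-up call, not timed

        var sw = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            work();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds / iterations;
    }
}
```

For example, wrapping the Fortran DLL call and the ported method in turn, as in `TimingSketch.AverageMilliseconds(() => model.Forwx(rshldr, rbed, resdep, toolparm), 100)` where `model` is an assumed instance of the class that holds Forwx, gives comparable per-call numbers. Timing the DLL call with trivially small inputs is one way to estimate how much of its cost is call and marshalling overhead.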

I timed my algorithms with the Complex number calculations replaced by double operations and gained a 4x speed increase. Of course the answers are wrong, but I now have a baseline for the code without the overhead of the Complex operator calls. This should be as fast as the code can get, if I could figure out how to inline the Complex math. The factor of 4x still leaves the code much slower than the equivalent Fortran.
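For reference, inlining the complex math by hand might look something like the sketch below, which performs the `px2[i] = px2[i] * zex[i]` update from the inner loop on plain doubles. It assumes the Complex arrays are split into separate real and imaginary double arrays; the class name, method name, and that array layout are illustrative assumptions, not something from the original code.

```csharp
using System;

// Hypothetical helper; assumes complex arrays are stored as
// separate real and imaginary double arrays of equal length.
public static class InlineComplexSketch
{
    // In-place element-wise complex multiply:
    //   (pRe[i] + i*pIm[i]) *= (zRe[i] + i*zIm[i])
    // written out on doubles so no operator call or struct copy is involved.
    public static void MultiplyInPlace(double[] pRe, double[] pIm, double[] zRe, double[] zIm)
    {
        if (pIm.Length != pRe.Length || zRe.Length != pRe.Length || zIm.Length != pRe.Length)
            throw new ArgumentException("All arrays must have the same length.");

        for (var i = 0; i < pRe.Length; i++)
        {
            // Compute both parts before storing, since each uses the old values.
            var re = pRe[i] * zRe[i] - pIm[i] * zIm[i];
            var im = pRe[i] * zIm[i] + pIm[i] * zRe[i];
            pRe[i] = re;
            pIm[i] = im;
        }
    }
}
```

Unlike replacing the Complex values with single doubles, this keeps the results correct, at the cost of code that no longer reads like the Fortran.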

So the final answer is that it can't be done. For serious number crunching, where time is important, C# does not provide an answer. I believe one has to stick with native Fortran or C++.

I thank everyone for their tips on improving the speed of C# numerics.
