简体   繁体   中英

How to speed up C# math code

I have some 3d interpolation code that takes up 90% of my projects runtime and cannot be precomputed.

What are some techniques that I could use to speed this up? Algorithmic or Micro Optimization?

Here is the code for those interested.

It basically takes data that was placed across the 2 3d arrays and interpolates the rest of the data.

EDIT: Also I am already spliting this into threads at a higher level for increased performance, but this doesn't help on the windows phone as they are all single core...

I will probably do something like (Single[] DensityMap = new Single[128 * 128 * 128];) to remove the multi-D array hit. I access the array in 100's of places and was hoping to not have to do that (wrapping in a function doesn't help as the windows phone won't inline the function call and it doesn't help perf then...)

float[, ,] DensityMap = new float[128, 128, 128];
float[, ,] PressureMap = new float[128, 128, 128];

unchecked
{
    for (int x = 0; x < g_CraftWorldConstants.RegionSizeX; x++)
    {
        int offsetX = (x / SAMPLE_RATE_3D_HOR) * SAMPLE_RATE_3D_HOR;
        int plusOffsetX = SAMPLE_RATE_3D_HOR + offsetX;
        int poxox = plusOffsetX - offsetX;
        double poxxpoxox = ((plusOffsetX - x) / (double)poxox);
        double xoxpoxox = ((x - offsetX) / (double)poxox);

        for (int y = 0; y < g_CraftWorldSettings.GET.RegionSizeY; y++)
        {
            int offsetY = (y / SAMPLE_RATE_3D_VERT) * SAMPLE_RATE_3D_VERT;
            int plusOffsetY = SAMPLE_RATE_3D_VERT + offsetY;
            int poyoy = plusOffsetY - offsetY;
            double poyypoyoy = ((plusOffsetY - y) / (double)poyoy);
            double yoypoyoy = ((y - offsetY) / (double)poyoy);

            for (int z = 0; z < g_CraftWorldConstants.RegionSizeZ; z++)
            {
                if (!(x % SAMPLE_RATE_3D_HOR == 0 && y % SAMPLE_RATE_3D_VERT == 0 && z % SAMPLE_RATE_3D_HOR == 0))
                {
                    int offsetZ = (z / SAMPLE_RATE_3D_HOR) * SAMPLE_RATE_3D_HOR;
                    int plusOffsetZ = SAMPLE_RATE_3D_HOR + offsetZ;
                    int pozoz = plusOffsetZ - offsetZ;
                    double pozzpozoz = ((plusOffsetZ - z) / (double)pozoz);
                    double zozpozoz = ((z - offsetZ) / (double)pozoz);

                    double x00 = poxxpoxox * in_DensityMap[offsetX, offsetY, offsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, offsetY, offsetZ];
                    double x10 = poxxpoxox * in_DensityMap[offsetX, offsetY, plusOffsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, offsetY, plusOffsetZ];
                    double x01 = poxxpoxox * in_DensityMap[offsetX, plusOffsetY, offsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, plusOffsetY, offsetZ];
                    double x11 = poxxpoxox * in_DensityMap[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * in_DensityMap[plusOffsetX, plusOffsetY, plusOffsetZ];

                    double r0 = poyypoyoy * x00 + yoypoyoy * x01;
                    double r1 = poyypoyoy * x10 + yoypoyoy * x11;
                    in_DensityMap[x, y, z] = (float)(pozzpozoz * r0 + zozpozoz * r1);

                    double x02 = poxxpoxox * in_CaveDensity[offsetX, offsetY, offsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, offsetY, offsetZ];
                    double x12 = poxxpoxox * in_CaveDensity[offsetX, offsetY, plusOffsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, offsetY, plusOffsetZ];
                    double x03 = poxxpoxox * in_CaveDensity[offsetX, plusOffsetY, offsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, plusOffsetY, offsetZ];
                    double x13 = poxxpoxox * in_CaveDensity[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * in_CaveDensity[plusOffsetX, plusOffsetY, plusOffsetZ];

                    double r2 = poyypoyoy * x02 + yoypoyoy * x03;
                    double r3 = poyypoyoy * x12 + yoypoyoy * x13;
                    in_CaveDensity[x, y, z] = (float)(pozzpozoz * r2 + zozpozoz * r3);
                }
            }
        }
    }
}

It seems that you have a lot of opportunities to optimize your code. Your x loop executes 128 times, your y loop executes 128*128=16,384 times, and your z loop executes 128^3=2,097,152 times. There are a number of terms inside your z loop that are only dependent on the x, or the y iterations, but they are recalculated at every z iteration. For example,

int poxox = plusOffsetX - offsetX;

and

double poxxpoxox = ((plusOffsetX - x) / (double)poxox);

These two terms are being calculated more than 2 million times, but only need to be calculated 128 times if my cursory scan of your function is correct. Move terms to the loop level that is appropriate so you don't waste cycles recalculating the same values many multiple times.

Here is your code with basic optimizations made. I'm curious to know how this affects your run times. Several of the terms are only dependent on the iteration value, and are the same for x, y, and z. So I pulled them out entirely and precompute them once. I also have moved the outer mod operations out of the inner loop, and modified the logic to ensure short circuit of the evaluation, which should should remove the majority of mod operations that were previously being executed.

int[] offsets = new int[128];
int[] plusOffsets = new int[128];
double[] poii = new double[128];
double[] ioip = new double[128];
for (int i = 0; i < 128; i++) {
    offsets[i] = (i / SAMPLE_RATE_3D_HOR) * SAMPLE_RATE_3D_HOR;
    plusOffsets[i] = SAMPLE_RATE_3D_HOR + offsets[i];
    double poioi = (double) (plusOffsets[i] - offsets[i]);
    poii[i] = ((plusOffsets[i] - i) / poioi);
    ioip[i] = ((i - offsets[i]) / poioi);
}

float[, ,] DensityMap = new float[128, 128, 128];
float[, ,] PressureMap = new float[128, 128, 128];

for (int x = 0; x < g_CraftWorldConstants.RegionSizeX; x++)
{
    int offsetX = offsets[x];
    int plusOffsetX = plusOffsets[x];
    double poxxpoxox = poii[x];
    double xoxpoxox = ioip[x];
    bool xModNot0 = !(x % SAMPLE_RATE_3D_HOR == 0);

    for (int y = 0; y < g_CraftWorldConstants.RegionSizeY; y++)
    {
        int offsetY = offsets[y];
        int plusOffsetY = plusOffsets[y];
        double poyypoyoy = poii[y];
        double yoypoyoy = ioip[y];
        bool yModNot0 = !(y % SAMPLE_RATE_3D_VERT == 0);

        for (int z = 0; z < g_CraftWorldConstants.RegionSizeZ; z++)
        {
            //if (!(x % SAMPLE_RATE_3D_HOR == 0 && y % SAMPLE_RATE_3D_VERT == 0 && z % SAMPLE_RATE_3D_HOR == 0))
            if (xModNot0 || yModNot0 || !(z % SAMPLE_RATE_3D_HOR == 0))
            {
                int offsetZ = offsets[z];
                int plusOffsetZ = plusOffsets[z];
                double pozzpozoz = poii[z];
                double zozpozoz = ioip[z];

                double x00 = poxxpoxox * DensityMap[offsetX, offsetY, offsetZ] + xoxpoxox * DensityMap[plusOffsetX, offsetY, offsetZ];
                double x10 = poxxpoxox * DensityMap[offsetX, offsetY, plusOffsetZ] + xoxpoxox * DensityMap[plusOffsetX, offsetY, plusOffsetZ];
                double x01 = poxxpoxox * DensityMap[offsetX, plusOffsetY, offsetZ] + xoxpoxox * DensityMap[plusOffsetX, plusOffsetY, offsetZ];
                double x11 = poxxpoxox * DensityMap[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * DensityMap[plusOffsetX, plusOffsetY, plusOffsetZ];

                double r0 = poyypoyoy * x00 + yoypoyoy * x01;
                double r1 = poyypoyoy * x10 + yoypoyoy * x11;
                DensityMap[x, y, z] = (float)(pozzpozoz * r0 + zozpozoz * r1);

                double x02 = poxxpoxox * PressureMap[offsetX, offsetY, offsetZ] + xoxpoxox * PressureMap[plusOffsetX, offsetY, offsetZ];
                double x12 = poxxpoxox * PressureMap[offsetX, offsetY, plusOffsetZ] + xoxpoxox * PressureMap[plusOffsetX, offsetY, plusOffsetZ];
                double x03 = poxxpoxox * PressureMap[offsetX, plusOffsetY, offsetZ] + xoxpoxox * PressureMap[plusOffsetX, plusOffsetY, offsetZ];
                double x13 = poxxpoxox * PressureMap[offsetX, plusOffsetY, plusOffsetZ] + xoxpoxox * PressureMap[plusOffsetX, plusOffsetY, plusOffsetZ];

                double r2 = poyypoyoy * x02 + yoypoyoy * x03;
                double r3 = poyypoyoy * x12 + yoypoyoy * x13;
                PressureMap[x, y, z] = (float)(pozzpozoz * r2 + zozpozoz * r3);
            }
        }
    } 
}

There are some things you can do to speed up your code:

  • Avoid using multidim.-arrays because they are slow
  • Use multiple threads
  • Store variables which going to be cast as a double in a double variable
  • Precalculate everything you can ( see hatchet's post )

Arrays

To simulate the 3D-array, you just can do it this way:

Single[] DensityMap = new Single[128 * 128 * 128];
DensityMap[z + (y * 128) + (x * 128 * 128)] = ...;

Use a jagged array rather than multi-dimensional, ie do

float[][][] DensityMap = new float[128][][];

And then create the inner arrays using for loops, or the LINQ syntax (which might be sub-optimal).

This will give MUCH better performance than using a multi-dimensional array, and equal or better performance than using a single-dimensional array and calculating the offsets yourself. That is, unless the cost of initializing the jagged array is significant; it will after all create 128^2 arrays. I'd benchmark it and only revert to a single-dimensional array if the cost really is significant.

You can change your for loops since you're not doing anything for the in-between values of all of these

for (int x = 0; x < 128; x+= SAMPLE_RATE_3D_HOR) {
   for (int y = 0; y < 128; y+= SAMPLE_RATE_3D_VERT) {
      for (int z = 0; z < 128; z+= SAMPLE_RATE_3D_HOR) {

Doing these in parallel would be even better.

With this you can eliminate the 6 million mod % calculations and 60+ thousand multiplies.

--edit-- Sorry, I missed the "!" on your line with the 3 mods. You can still skip some of those calculations. See comments below.

1) Do you really need doubles? Especially you are mixing some floats, doubles and ints.

2) You should precalculate the k / SAMPLE_RATE_3D_HOR * SAMPLE_RATE_3D_HOR pattern.

int pre_calc[128];
for( int i = 0; i < 128; ++i )
    pre_calc[i] = (i / SAMPLE_RATE_3D_HOR) * SAMPLE_RATE_3D_HOR;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM