C++ time spent allocating vectors

I am trying to speed up a piece of code that is run a total of 150,000,000 times.

I have analysed it using "Very Sleepy", which has indicated that the code is spending the most time in these 3 areas, shown in the image:

[Image: time spent]

The code is as follows:

double nonLocalAtPixel(int ymax, int xmax, int y, int x , vector<nodeStructure> &nodeMST, int squareDimension, Mat &inputImage) {

    vector<double> nodeWeights(8,0);
    vector<double> nodeIntensities(8,0);
    bool allZeroWeights = true;
    int numberEitherside = (squareDimension - 1) / 2;
    int index = 0;
    for (int j = y - numberEitherside; j < y + numberEitherside + 1; j++) {
        for (int i = x - numberEitherside; i < x + numberEitherside + 1; i++) {

            // out of range or the centre pixel
            if (j<0 || i<0 || j>ymax || i>xmax || (j == y && i == x)) {
                index++;
                continue;
            }
            else {
                int centreNodeIndex = y*(xmax+1) + x;
                int thisNodeIndex = j*(xmax+1) + i;

                // add to intensity list
                Scalar pixelIntensityScalar = inputImage.at<uchar>(j, i);
                nodeIntensities[index] = ((double)*pixelIntensityScalar.val);
                // find weight from p to q
                float weight = findWeight(nodeMST, thisNodeIndex, centreNodeIndex);
                if (weight!=0 && allZeroWeights) {
                    allZeroWeights = false;
                }
                nodeWeights[index] = (weight);
                index++;
            }
        }
    }


    // find min b
    int minb = -1;
    double bCost = -1;  // double, so the cost assigned from thisbCost is not truncated

    if (allZeroWeights) {
        return 0;
    }
    else {
        // iterate over all b values
        for (int i = 0; i < nodeWeights.size(); i++) {
            if (nodeWeights[i]==0) {
                continue;
            }
            double thisbCost = nonLocalWithb(nodeIntensities[i], nodeIntensities, nodeWeights);

            if (bCost<0 || thisbCost<bCost) {
                bCost = thisbCost;
                minb = nodeIntensities[i];
            }
        }
    }
    return minb;
}

Firstly, am I right to assume that the time reported by Very Sleepy means the majority of the time is spent allocating and deallocating the vectors?

Secondly, are there any suggestions to speed this code up?

Thanks

  • use std::array instead of std::vector, since the size is fixed
  • reuse the vectors by passing them as arguments to the function, or through a global variable if possible (I don't know how the rest of the code is structured, so I would need more information)
  • allocate one vector of size 16 instead of two vectors of size 8; this will make your memory less fragmented
  • use parallelism if findWeight is thread-safe (you would need to provide more details on that too)
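
The first two suggestions can be sketched roughly as follows. This is only a minimal illustration, not the asker's actual function: `kNeighbours`, `weightedSumStack` and `weightedSumReuse` are hypothetical names, the weighted sum is a stand-in for the real per-pixel work, and the window is assumed to be fixed at 8 neighbours so the size is known at compile time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: the neighbourhood always has 8 entries (3x3 window minus the centre).
constexpr std::size_t kNeighbours = 8;

// Option 1: std::array is stored on the stack, so a call performs no heap allocation.
double weightedSumStack(const double* weights, const double* intensities) {
    std::array<double, kNeighbours> w{};  // zero-initialised, like vector<double>(8, 0)
    std::array<double, kNeighbours> v{};
    for (std::size_t k = 0; k < kNeighbours; ++k) {
        w[k] = weights[k];
        v[k] = intensities[k];
    }
    double sum = 0.0;
    for (std::size_t k = 0; k < kNeighbours; ++k) {
        sum += w[k] * v[k];
    }
    return sum;
}

// Option 2: the caller owns the scratch vectors and passes them in by reference.
// assign() reuses the existing capacity, so only the first call allocates.
double weightedSumReuse(std::vector<double>& w, std::vector<double>& v,
                        const double* weights, const double* intensities) {
    w.assign(kNeighbours, 0.0);
    v.assign(kNeighbours, 0.0);
    assert(w.size() == kNeighbours && v.size() == kNeighbours);
    for (std::size_t k = 0; k < kNeighbours; ++k) {
        w[k] = weights[k];
        v[k] = intensities[k];
    }
    double sum = 0.0;
    for (std::size_t k = 0; k < kNeighbours; ++k) {
        sum += w[k] * v[k];
    }
    return sum;
}
```

With the second option, the two vectors would be created once outside the loop that calls the function 150,000,000 times and handed in on every call. If squareDimension can vary at run time, std::array no longer fits directly and reusing caller-owned buffers is probably the simpler route.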
