I am trying to improve the speed of my cross-correlation algorithm. What can I do to make my C code run faster?
I created a cross-correlation algorithm, and I am trying to maximize its performance by reducing the time it takes to run. First, I reduced the number of function calls within the "crossCorrelationV2" function. Second, I created several macros at the top of the program for constants. Third, I reduced the number of loops inside the "crossCorrelationV2" function. The code below is the most recent version that I have.
Are there any other methods I can use to reduce the processing time of my code?
Let's assume that I am only focused on the functions "crossCorrelationV2" and "createAnalyzingWave".
I would be glad for any advice, whether about programming in general or pertaining to those two specific functions; I am a beginner programmer. Thanks.
#include <stdio.h>
#include <stdlib.h>
#define ARRAYSIZE 4096
#define PULSESNUMBER 16
#define DATAFREQ 1300
// Print the contents of the array onto the console.
void printArray(double array[], int size){
    int k;
    for (k = 0; k < size; k++){
        printf("%lf ", array[k]);
    }
    printf("\n");
}
// Creates analyzing square wave. This square wave has unity (1) magnitude.
// The number of high values in each period is determined by high values = (analyzingT/2) / time increment
void createAnalyzingWave(double analyzingFreq, double wave[]){
    // 1.0 / DATAFREQ forces floating-point division; the integer expression
    // 1 / DATAFREQ evaluates to 0 and makes the time increment zero.
    int highValues = (1 / analyzingFreq) * 0.5 / ((PULSESNUMBER * (1.0 / DATAFREQ) / ARRAYSIZE));
    int counter = 0;
    int p;
    for(p = 1; p <= ARRAYSIZE; p++){
        if ((counter % 2) == 0){
            wave[p - 1] = 1;
        } else{
            wave[p - 1] = 0;
        }
        if (p % highValues == 0){
            counter++;
        }
    }
}
// Creates data square wave (for testing purposes, for the real implementation actual ADC data will be used). This
// square wave has unity magnitude.
// The number of high values in each period is determined by high values = array size / (2 * number of pulses)
void createDataWave(double wave[]){
    int highValues = ARRAYSIZE / (2 * PULSESNUMBER);
    int counter = 0;
    int p;
    for(p = 0; p < ARRAYSIZE; p++){
        if ((counter % 2) == 0){
            wave[p] = 1;
        } else{
            wave[p] = 0;
        }
        if ((p + 1) % highValues == 0){
            counter++;
        }
    }
}
// Finds the average of all the values inside an array
double arrayAverage(double array[], int size){
    int i;
    double sum = 0;
    // Visits the same elements as for(i = 0; i < size; i++), but from
    // index size - 1 down to 0
    for(i = size; i--; ){
        sum = array[i] + sum;
    }
    return sum / size;
}
// Cross-Correlation algorithm
double crossCorrelationV2(double dataWave[], double analyzingWave[]){
    int bigArraySize = (2 * ARRAYSIZE) - 1;
    // Expand analyzing array into array of size 2arraySize-1
    int lastArrayIndex = ARRAYSIZE - 1;
    int lastBigArrayIndex = 2 * ARRAYSIZE - 2; // bigArraySize - 1
    double bigAnalyzingArray[bigArraySize];
    int i;
    int b;
    // Set first few elements of the array equal to analyzingWave
    // Set remainder of big analyzing array to 0
    for(i = 0; i < ARRAYSIZE; i++){
        bigAnalyzingArray[i] = analyzingWave[i];
        // The big array has only 2 * ARRAYSIZE - 1 elements, so skip the
        // write that would land one past its end
        if (i < lastArrayIndex){
            bigAnalyzingArray[i + ARRAYSIZE] = 0;
        }
    }
    double maxCorrelationValue = 0;
    double currentCorrelationValue;
    // "Beginning" of correlation algorithm proper
    for(i = 0; i < bigArraySize; i++){
        currentCorrelationValue = 0;
        for(b = lastBigArrayIndex; b > 0; b--){
            if (b >= lastArrayIndex){
                currentCorrelationValue = dataWave[b - lastBigArrayIndex / 2] * bigAnalyzingArray[b] + currentCorrelationValue;
            }
            bigAnalyzingArray[b] = bigAnalyzingArray[b - 1];
        }
        bigAnalyzingArray[0] = 0;
        if (currentCorrelationValue > maxCorrelationValue){
            maxCorrelationValue = currentCorrelationValue;
        }
    }
    return maxCorrelationValue;
}
int main(){
    int samplesNumber = 25;
    double analyzingFreq = 1300;
    double analyzingWave[ARRAYSIZE];
    double dataWave[ARRAYSIZE];
    createAnalyzingWave(analyzingFreq, analyzingWave);
    //createDataWave(arraySize, pulsesNumber, dataWave);
    double maximumCorrelationArray[samplesNumber];
    int i;
    for(i = 0; i < samplesNumber; i++){
        createDataWave(dataWave);
        maximumCorrelationArray[i] = crossCorrelationV2(dataWave, analyzingWave);
    }
    printf("Average of the array values: %lf\n", arrayAverage(maximumCorrelationArray, samplesNumber));
    return 0;
}
The first point is that you are explicitly shifting the analyzing array. This requires twice as much memory, and moving the items takes about 50% of the run time. In a test here, crossCorrelationV2 takes 4.1 seconds, while the implementation crossCorrelationV3 runs in ~2.0 seconds.
The next thing is that you are spending time multiplying by zero on the padded array. Removing those multiplications, removing the padding as well, and simplifying the indices, we end up with crossCorrelationV4, which makes the program run in ~1.0 second.
// Cross-Correlation algorithm
double crossCorrelationV3(double dataWave[], double analyzingWave[]){
    int bigArraySize = (2 * ARRAYSIZE) - 1;
    // Expand analyzing array into array of size 2arraySize-1
    int lastArrayIndex = ARRAYSIZE - 1;
    int lastBigArrayIndex = 2 * ARRAYSIZE - 2; // bigArraySize - 1
    double bigAnalyzingArray[bigArraySize];
    int i;
    int b;
    int lowest;
    // Set first few elements of the array equal to analyzingWave
    // Set remainder of big analyzing array to 0
    for(i = 0; i < ARRAYSIZE; i++){
        bigAnalyzingArray[i] = analyzingWave[i];
        // The big array has only 2 * ARRAYSIZE - 1 elements, so skip the
        // write that would land one past its end
        if (i < lastArrayIndex){
            bigAnalyzingArray[i + ARRAYSIZE] = 0;
        }
    }
    double maxCorrelationValue = 0;
    double currentCorrelationValue;
    // "Beginning" of correlation algorithm proper
    for(i = 0; i < bigArraySize; i++){
        currentCorrelationValue = 0;
        // Instead of checking if b >= lastArrayIndex inside the loop I use it as
        // a stopping condition. Once i passes lastArrayIndex, the elements below
        // b == i would have been shifted-in zeros, so the loop stops there;
        // this also keeps the index b - i from going negative.
        lowest = (i > lastArrayIndex) ? i : lastArrayIndex;
        for(b = lastBigArrayIndex; b >= lowest; b--){
            // Instead of shifting bigAnalyzingArray[b] = bigAnalyzingArray[b - 1] every iteration
            // I simply use bigAnalyzingArray[b - i]
            currentCorrelationValue = dataWave[b - lastBigArrayIndex / 2] * bigAnalyzingArray[b - i] + currentCorrelationValue;
        }
        if (currentCorrelationValue > maxCorrelationValue){
            maxCorrelationValue = currentCorrelationValue;
        }
    }
    return maxCorrelationValue;
}
// Cross-Correlation algorithm
double crossCorrelationV4(double dataWave[], double analyzingWave[]){
    int lastArrayIndex = ARRAYSIZE - 1;
    // No bigAnalyzingArray is allocated here, and analyzingWave is not
    // copied or zero-padded; both waves are indexed directly.
    int i;
    int b;
    double maxCorrelationValue = 0;
    double currentCorrelationValue;
    // Compute the correlation by symmetric pairs.
    // The idea here is to simplify the indices of the inner loops since
    // they are computed more times.
    // i runs up to lastArrayIndex so that the one-element overlaps at both
    // ends are included, as they are in crossCorrelationV2.
    for(i = 0; i <= lastArrayIndex; i++){
        currentCorrelationValue = 0;
        // Only the overlapping part of the two waves is multiplied
        for(b = lastArrayIndex - i; b >= 0; b--){
            currentCorrelationValue += dataWave[b] * analyzingWave[b + i];
        }
        if (currentCorrelationValue > maxCorrelationValue){
            maxCorrelationValue = currentCorrelationValue;
        }
        if(i != 0){
            currentCorrelationValue = 0;
            // Correlate shifting to the other side
            for(b = lastArrayIndex - i; b >= 0; b--){
                currentCorrelationValue += dataWave[b + i] * analyzingWave[b];
            }
            if (currentCorrelationValue > maxCorrelationValue){
                maxCorrelationValue = currentCorrelationValue;
            }
        }
    }
    return maxCorrelationValue;
}
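As a sanity check on the reasoning above, here is a small sketch that compares a shift-and-pad version (in the style of crossCorrelationV2) against direct indexing (in the style of crossCorrelationV4) and counts the multiply-adds each one performs. The names maxCorrelationShifted, maxCorrelationDirect, paddedProducts, directProducts and the size N = 8 are made up for this illustration and are not part of the original program.

```c
#include <stdio.h>

#define N 8  /* small illustrative size, not the ARRAYSIZE used above */

static long paddedProducts = 0;  /* multiply-adds done by the padded version */
static long directProducts = 0;  /* multiply-adds done by the direct version */

/* Shifts a zero-padded copy of the analyzing wave by one element per outer
   iteration, in the style of crossCorrelationV2. */
double maxCorrelationShifted(const double data[], const double analyzing[]) {
    double big[2 * N - 1];
    double best = 0;
    int i, b;
    for (i = 0; i < 2 * N - 1; i++) {
        big[i] = (i < N) ? analyzing[i] : 0;
    }
    for (i = 0; i < 2 * N - 1; i++) {
        double sum = 0;
        for (b = 2 * N - 2; b > 0; b--) {
            if (b >= N - 1) {
                sum += data[b - (N - 1)] * big[b];
                paddedProducts++;
            }
            big[b] = big[b - 1];
        }
        big[0] = 0;
        if (sum > best) best = sum;
    }
    return best;
}

/* Indexes both waves directly and multiplies only the overlapping part,
   in the style of crossCorrelationV4. */
double maxCorrelationDirect(const double data[], const double analyzing[]) {
    double best = 0;
    int lag, b;
    for (lag = 0; lag < N; lag++) {
        double sum = 0;
        for (b = 0; b < N - lag; b++) {
            sum += data[b] * analyzing[b + lag];
            directProducts++;
        }
        if (sum > best) best = sum;
        if (lag != 0) {
            sum = 0;
            for (b = 0; b < N - lag; b++) {
                sum += data[b + lag] * analyzing[b];
                directProducts++;
            }
            if (sum > best) best = sum;
        }
    }
    return best;
}
```

Calling both functions on the same pair of N-element arrays should return the same maximum. The padded version performs (2N - 1) * N multiply-adds and the direct version N * N, which is roughly half and is consistent with the drop from ~2.0 to ~1.0 seconds reported above.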
If you want more optimization, you can unroll some iterations of the loop and enable compiler optimizations such as vector extensions.
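A minimal sketch of what unrolling could look like for the inner multiply-add loop, assuming a helper named dotUnrolled4 (a hypothetical name, not part of the code above). It uses four independent accumulators so that consecutive additions do not have to wait on each other.

```c
#include <stddef.h>

/* Sum of a[k] * b[k] for k in [0, n), unrolled by four.
   dotUnrolled4 is a hypothetical helper introduced for illustration. */
double dotUnrolled4(const double a[], const double b[], size_t n) {
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t k = 0;
    /* Main loop: four products per iteration, one per accumulator */
    for (; k + 4 <= n; k += 4) {
        s0 += a[k]     * b[k];
        s1 += a[k + 1] * b[k + 1];
        s2 += a[k + 2] * b[k + 2];
        s3 += a[k + 3] * b[k + 3];
    }
    /* Tail loop: the remaining 0 to 3 elements */
    for (; k < n; k++) {
        s0 += a[k] * b[k];
    }
    return (s0 + s1) + (s2 + s3);
}
```

With this helper, the two inner loops of crossCorrelationV4 could be written as dotUnrolled4(dataWave, analyzingWave + i, ARRAYSIZE - i) and dotUnrolled4(dataWave + i, analyzingWave, ARRAYSIZE - i). The additions happen in a different order than in the plain loop, so for non-integer data the result may differ in the last bits. On GCC or Clang, building with -O3 -march=native lets the compiler use the vector instructions of the host CPU; whether that beats manual unrolling depends on the compiler and the machine, so it is worth timing both.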