C for loops optimization

I'm trying to learn how to optimize my C code, so I found some articles on the internet and rewrote my function so that it should execute faster. When I compile without optimization flags, the rewrite pays off (the second function is about 12% faster than the first), but when I compile with gcc -O3, the second function is much slower (about 50%). Do you have any idea why that is? Thanks for any help.

First function:

typedef struct {
    double *data;
    int rows;
    int columns;
} Matrix;

Matrix *matrixMultiplication(Matrix *a, Matrix *b) {
    if(a->columns != b->rows)
        return NULL;
    Matrix *matrix = createMatrix(a->rows, b->columns);
    set(0, matrix);
    for(int i = 0; i < matrix->rows; i++) {
        for(int j = 0; j < a->columns; j++) {
            for(int k = 0; k < b->columns; k++) {
                matrix->data[i * matrix->columns + k] += a->data[i * a->columns + j] * b->data[j * b->columns + k];
            }
        }
    }
    return matrix;
}

Second function:

typedef struct {
    float *data;
    unsigned int rows;
    unsigned int columns;
} Matrix_2;

unsigned int matrixMultiplication_2(Matrix_2 *a, Matrix_2 *b, Matrix_2 **c) {
    Matrix_2 *matrix;
    if(a->columns != b->rows)
        return 0;
    createMatrix_2(a->rows, b->columns, &matrix);
    set_2(0, matrix);
    for(unsigned int i = matrix->rows; i--;) {
        for(unsigned int j = a->columns; j--;) {
            for(unsigned int k = b->columns; k--;) {
                matrix->data[i * matrix->columns + k] += a->data[i * a->columns + j] * b->data[j * b->columns + k];
            }
        }
    }
    *c = matrix;
    return 1;
}

That's because compiler optimizations are based on pattern recognition. Your compiler knows a ton of typical code patterns, and knows how to transform them to yield faster code. However, while this library of code patterns is quite extensive, it's finite.

The first function uses the canonical for(int i = 0; i < count; i++) loop control. You can bet that any compiler worth its salt has a pattern for this, yielding close to optimal code for the loop control.

The second function uses a pattern that's rarely seen in human code. While I personally like this pattern for its brevity, there are many programmers out there who find it too cryptic to be used. Apparently, your compiler does not come with an optimizer pattern for this, so the resulting code does not get fully optimized.
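To make the two loop controls concrete, here is a minimal sketch. The names sum_count_up and sum_count_down are made up for illustration and do not appear in the question. Both loops visit the same indices and differ only in order; whether a given compiler optimizes them equally well is something you would have to check by measuring or by reading the generated assembly.

```c
#include <stddef.h>

/* Canonical count-up loop: visits indices 0, 1, ..., count - 1. */
long sum_count_up(const int *values, size_t count) {
    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += values[i];
    }
    return sum;
}

/* Count-down loop in the style of the second function. The condition
 * `i--` tests the old value of i and then decrements it, so the body
 * sees count - 1, count - 2, ..., 0. When the test finally fails at
 * i == 0, the unsigned counter wraps around; that is well defined and
 * the wrapped value is never used as an index. */
long sum_count_down(const int *values, size_t count) {
    long sum = 0;
    for (size_t i = count; i--;) {
        sum += values[i];
    }
    return sum;
}
```

Called on the same array, both functions return the same sum, so the choice between them is purely about readability and about what the optimizer makes of each form.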


Optimizations like replacing for(int i = 0; i < count; i++) with for(int i = count; i--;) were useful when C was still little more than a high-level assembler. But compiler optimizations have long since turned code translation into far too complicated a beast to be optimized by such tricks. Today, most optimizations need to be done on the algorithmic level. Translation-level optimizations should generally be left to the compiler and fostered by writing canonical code that the compiler can optimize.
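Applied to the second function from the question, writing canonical code could look like the sketch below. multiply_canonical is a hypothetical name, and unlike the original it writes into a caller-allocated result, because createMatrix_2 and set_2 are not shown in the question. The arithmetic is unchanged; only the loop control counts up. Whether this recovers the lost speed at -O3 depends on the compiler and the target, so measure before drawing conclusions.

```c
#include <stddef.h>

/* Same layout as Matrix_2 in the question: row-major float data. */
typedef struct {
    float *data;
    unsigned int rows;
    unsigned int columns;
} Matrix_2;

/* Hypothetical helper, not from the question: multiplies a by b into the
 * caller-allocated matrix c. Returns 1 on success, 0 on a shape mismatch. */
unsigned int multiply_canonical(const Matrix_2 *a, const Matrix_2 *b, Matrix_2 *c) {
    if (a->columns != b->rows || c->rows != a->rows || c->columns != b->columns)
        return 0;

    /* Clear the result, standing in for set_2(0, matrix) in the original. */
    const size_t total = (size_t)c->rows * c->columns;
    for (size_t n = 0; n < total; n++)
        c->data[n] = 0.0f;

    /* Same i-j-k order and loop body as the question, with count-up control. */
    for (unsigned int i = 0; i < c->rows; i++) {
        for (unsigned int j = 0; j < a->columns; j++) {
            for (unsigned int k = 0; k < b->columns; k++) {
                c->data[i * c->columns + k] += a->data[i * a->columns + j] * b->data[j * b->columns + k];
            }
        }
    }
    return 1;
}
```

Keeping the loop body identical means any timing difference against matrixMultiplication_2 can be attributed to the loop control alone, which is the comparison the question is really about.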
