
Optimizing this function for improved speed

I am trying to speed up this C function called "list_find":

int list_find(list_t *list, int value) {
    for (int i = 0; i < list->size; i++) {
        if (list_get(list, i) == value) {
            return i;
        }
    }
    return -1;
}

list_find searches for the index at which a particular value is stored. This implementation calls list_get to retrieve the value at index 0, then calls it again to retrieve the value at index 1, and continues in this way until it either finds the value or reaches the end of the list.

Arguments:
list: pointer to a linked list struct
value: the value to search for

Returns: the index where the value is stored, or -1 if the value is not found

For reference, here is "list_get", which retrieves the value stored at a particular index within a linked list, where index is the position of the node whose value should be retrieved:

int list_get(list_t *list, int index) {
    if (list == NULL) {
        return -1;
    }
    if (index >= list->size || index < 0) {
        return -1;
    }
    node_t *current = list->head;
    for (int i = 0; i < index; i++) {
        current = current->next;
    }
    return current->value;
}

Here are the structs for list_t and node_t:

typedef struct node {
    int value;
    struct node *next;
} node_t;


typedef struct list {
    int size;
    node_t *head;
} list_t;
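
To see why the original list_find is slow, it can help to count pointer steps. The sketch below is not part of the original code: hops_for_get and hops_for_failed_find are hypothetical helper names, used only to count how many `current = current->next` steps a failed search triggers when every index is fetched through list_get. For a list of n nodes the total is 0 + 1 + ... + (n-1) = n(n-1)/2, so the work grows quadratically, whereas a single traversal needs at most n steps.

```c
#include <stddef.h>

/* Same layout as the structs in the question. */
typedef struct node {
    int value;
    struct node *next;
} node_t;

typedef struct list {
    int size;
    node_t *head;
} list_t;

/* Hypothetical helper: number of pointer steps that list_get(list, index)
 * performs for a valid index. It mirrors the loop inside list_get. */
long hops_for_get(const list_t *list, int index) {
    long hops = 0;
    const node_t *current = list->head;
    for (int i = 0; i < index; i++) {
        current = current->next;
        hops++;
    }
    return hops;
}

/* Hypothetical helper: total pointer steps when list_find does not find
 * the value and therefore calls list_get for every index 0 .. size-1. */
long hops_for_failed_find(const list_t *list) {
    long total = 0;
    for (int i = 0; i < list->size; i++) {
        total += hops_for_get(list, i);
    }
    return total;
}
```

For a four-node list, the failed search costs 0 + 1 + 2 + 3 = 6 steps; the gap relative to a single pass widens quickly as the list grows.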

I am looking to greatly improve the speed of list_find without changing any other functions or declarations. Essentially, I only want to edit "list_find". Any tips would be helpful!

UPDATE:

int list_find(list_t *list, int value) {
    if (list == NULL)
    {
        return -1;
    }
    node_t *current = list->head;
    int size = list->size;
    for (int i = 0; i < size; i++) {
        if (i >= size || i < 0)
        {
            return -1;
        }
        if (value == current->value)
        {
            return i;  /* return the index of the match, not the value itself */
        }
        current = current->next;
    }
    return -1;
}

I have eliminated the use of list_get and now iterate over the list inside the function, but it doesn't seem to be faster, or else the way I time it in main is off. How does this look?
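
For comparison, here is a minimal sketch of a single-pass list_find that walks the nodes directly and returns the index of the first match, as the original specification describes. It assumes the list_t and node_t layouts from the question (repeated here so the block compiles on its own); the extra `current != NULL` check is a defensive assumption in case size and the actual chain length ever disagree.

```c
#include <stddef.h>

/* Same layout as the structs in the question. */
typedef struct node {
    int value;
    struct node *next;
} node_t;

typedef struct list {
    int size;
    node_t *head;
} list_t;

/* Walks the list once and returns the index of the first node whose
 * value matches, or -1 if the list is NULL or the value is absent. */
int list_find(list_t *list, int value) {
    if (list == NULL) {
        return -1;
    }
    int index = 0;
    for (node_t *current = list->head;
         current != NULL && index < list->size;
         current = current->next) {
        if (current->value == value) {
            return index;
        }
        index++;
    }
    return -1;
}
```

Each node is visited at most once, so the cost is linear in the list length. If timing in main still shows no difference, the list may simply be too short for the difference to be measurable.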

The biggest problem is the linked list itself: the CPU has to fetch one element (with all the cache misses, etc.) before it can try to fetch the next element, so all "out-of-order execution" parallelism is destroyed and the CPU spends almost all of its time waiting for one thing to happen at a time.

The biggest optimization is to not use a raw linked list.

For example, you could use a linked list of structures where each structure contains a "next structure" pointer, a fixed-size array, and some metadata indicating which elements of the array are in use (e.g. the first and last index). With an array of 1234 elements per structure, iteration could be up to 1234 times faster than a raw linked list: partly because the next element in the list is likely to be in the same cache line as the previous element, and mostly because the CPU won't spend almost all of its time stalled waiting for cache misses. For example:

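    #include <stddef.h>  /* NULL */

    struct list_block {
        struct list_block *next;
        int used_elements;   /* number of valid entries in array */
        int array[1000];
    };

    /* Returns the overall index of the first match, or -1 if not found. */
    int list_find(struct list_block *block, int value) {
        int k = 0;  /* number of elements in the blocks already searched */

        while (block != NULL) {
            for (int i = 0; i < block->used_elements; i++) {
                if (block->array[i] == value) {
                    return k + i;
                }
            }
            k += block->used_elements;
            block = block->next;
        }
        return -1;
    }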
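
The block-based list also needs some way to be filled. The following is only a sketch of one possible approach: block_list_append and BLOCK_CAPACITY are hypothetical names that do not appear in the answer, and the struct is repeated (with the same 1000-element array) so the block compiles on its own. It appends to the last block and allocates a new one only when that block is full.

```c
#include <stdlib.h>

#define BLOCK_CAPACITY 1000  /* hypothetical name for the array size used above */

struct list_block {
    struct list_block *next;
    int used_elements;            /* number of valid entries in array */
    int array[BLOCK_CAPACITY];
};

/* Hypothetical helper: appends value after the current tail block.
 * Pass NULL as tail to start a new list. Returns the tail block after
 * the append (a new block if the old one was full or absent), or NULL
 * if allocation fails, in which case the list is left unchanged. */
struct list_block *block_list_append(struct list_block *tail, int value) {
    if (tail == NULL || tail->used_elements == BLOCK_CAPACITY) {
        struct list_block *fresh = malloc(sizeof *fresh);
        if (fresh == NULL) {
            return NULL;
        }
        fresh->next = NULL;
        fresh->used_elements = 0;
        if (tail != NULL) {
            tail->next = fresh;   /* link the new block behind the full one */
        }
        tail = fresh;
    }
    tail->array[tail->used_elements++] = value;
    return tail;
}
```

The caller keeps the first returned block as the head to pass to list_find, and keeps the latest returned block as the tail for further appends. With this layout, one allocation and one pointer dereference are amortized over up to 1000 values.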
