Linked lists or hash tables?

I have a doubly linked, linear list of around 5000 entries (they are NOT all inserted at once), and I occasionally traverse it looking for a particular entry, though not very often. Should I consider replacing the linked list with a hash table in this case? I'm using C on Linux.

Whether a hash table is the better choice depends on the use case, which you have not described in detail. More importantly, make sure the performance bottleneck really is in this part of the code. If the code runs only once in a while and is not on a critical path, there is no point in changing it.
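One rough way to check is to time the existing lookup before touching anything. The sketch below is only an illustration: `struct entry`, `build_list`, `list_find` and `time_lookups` are hypothetical names, since the question does not show the real types, and a profiler run on the real application will tell you more than a micro-benchmark like this.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical entry type; the real one is not shown in the question. */
struct entry {
    int key;
    struct entry *prev, *next;
};

/* Build a doubly linked list holding keys 0..n-1 in order. */
struct entry *build_list(int n)
{
    struct entry *head = NULL, *tail = NULL;
    for (int i = 0; i < n; i++) {
        struct entry *e = malloc(sizeof *e);
        if (e == NULL)
            abort();
        e->key = i;
        e->prev = tail;
        e->next = NULL;
        if (tail != NULL)
            tail->next = e;
        else
            head = e;
        tail = e;
    }
    return head;
}

/* Plain linear search, as in the question. */
struct entry *list_find(struct entry *head, int key)
{
    for (struct entry *e = head; e != NULL; e = e->next)
        if (e->key == key)
            return e;
    return NULL;
}

/* Average wall-clock nanoseconds per lookup, cycling through keys 0..n-1. */
double time_lookups(struct entry *head, int n, int reps)
{
    struct timespec t0, t1;
    struct entry *volatile sink = NULL;  /* discourages removing the loop */

    assert(n > 0 && reps > 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++)
        sink = list_find(head, i % n);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / reps;
}
```

Calling `time_lookups(build_list(5000), 5000, 100000)` gives an average cost per search; multiply that by how often the search actually runs to see whether it matters at all.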

If a profiler has not shown this code to be the slow part of the application, then you shouldn't do anything about it yet.

If it is slow, but the code is tested, works, and is clear, and there are other, slower areas that you can work on speeding up, do those first.

If it is buggy, then you need to fix it anyway, so go for the hash table, as its lookups will be faster than the list's. This assumes that the order in which the data is traversed does not matter. If you care about insertion order, then stick with the list (you can keep the order with a hash table, but that makes the code much trickier).

Given that you need to search the list only on occasion, the odds of this being a significant bottleneck in your code are small.

Another data structure to look at is a "skip list", which basically lets you skip over large portions of the list during a search. It requires that the list be kept sorted, however, which, depending on what you are doing, may make the code slower overall.

Have you measured and found a performance hit with the lookup? If so, a hash_map or hash table should be good.

If you need to traverse the list in order (not as a part of searching for elements, but say for displaying them) then a linked list is a good choice. If you're only storing them so that you can look up elements then a hash table will greatly outperform a linked list (for all but the worst possible hash function).
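For a sense of what the lookup-only case looks like in C, here is a minimal sketch of a chained hash table, assuming 32-bit integer keys (the question does not say what the keys are). `NBUCKETS`, `struct table`, `table_put` and `table_get` are made-up names, and the multiplicative hash is just one reasonable choice, not a recommendation for your data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 1024u          /* 2^10 buckets; ~5 entries each for 5000 keys */

struct node {
    uint32_t key;
    void *value;
    struct node *next;          /* next node in the same bucket */
};

struct table {
    struct node *buckets[NBUCKETS];
};

/* Multiplicative hashing: keep the top 10 bits of key * 2^32/phi. */
size_t bucket_of(uint32_t key)
{
    return (uint32_t)(key * UINT32_C(2654435769)) >> 22;
}

/* Returns an empty table, or NULL if allocation fails. */
struct table *table_new(void)
{
    return calloc(1, sizeof(struct table));
}

/* Insert or overwrite; returns 0 on success, -1 if out of memory. */
int table_put(struct table *t, uint32_t key, void *value)
{
    assert(t != NULL);
    size_t b = bucket_of(key);
    for (struct node *n = t->buckets[b]; n != NULL; n = n->next) {
        if (n->key == key) {
            n->value = value;
            return 0;
        }
    }
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->key = key;
    n->value = value;
    n->next = t->buckets[b];    /* push onto the front of the chain */
    t->buckets[b] = n;
    return 0;
}

/* Returns the stored value, or NULL if the key is absent. */
void *table_get(const struct table *t, uint32_t key)
{
    assert(t != NULL);
    for (const struct node *n = t->buckets[bucket_of(key)]; n != NULL; n = n->next)
        if (n->key == key)
            return n->value;
    return NULL;
}
```

A lookup only walks one short chain instead of the whole list, which is where the speed-up comes from; the price is that the entries no longer have any meaningful order.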

If your application calls for both types of operations, you might consider keeping both, and using whichever one is appropriate for a particular task. The memory overhead would be small, since you'd only need to keep one copy of each element in memory and have the data structures store pointers to these objects.
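As a sketch of the "keep both" idea, each element below carries the list links and a hash-chain link, so there is exactly one copy of the data and two ways to reach it. All names (`struct store`, `store_add`, `store_find`, `store_remove`) and the `unsigned` key are assumptions for illustration, and `store_add` does not check for duplicate keys.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NBUCKETS 1024u

struct entry {
    unsigned key;
    struct entry *prev, *next;  /* insertion-order list */
    struct entry *hnext;        /* chain within one hash bucket */
};

struct store {
    struct entry *head, *tail;
    struct entry *buckets[NBUCKETS];
};

void store_init(struct store *s)
{
    *s = (struct store){0};
}

/* Append to the list and index in the hash table; NULL if out of memory. */
struct entry *store_add(struct store *s, unsigned key)
{
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->key = key;

    e->prev = s->tail;          /* list: append at the tail */
    e->next = NULL;
    if (s->tail != NULL)
        s->tail->next = e;
    else
        s->head = e;
    s->tail = e;

    e->hnext = s->buckets[key % NBUCKETS];  /* hash: push on the bucket */
    s->buckets[key % NBUCKETS] = e;
    return e;
}

/* Lookup goes through the hash table, not the list. */
struct entry *store_find(const struct store *s, unsigned key)
{
    for (struct entry *e = s->buckets[key % NBUCKETS]; e != NULL; e = e->hnext)
        if (e->key == key)
            return e;
    return NULL;
}

/* Unlink an entry that belongs to s from both structures and free it. */
void store_remove(struct store *s, struct entry *e)
{
    if (e->prev != NULL) e->prev->next = e->next; else s->head = e->next;
    if (e->next != NULL) e->next->prev = e->prev; else s->tail = e->prev;

    struct entry **pp = &s->buckets[e->key % NBUCKETS];
    while (*pp != e) {
        assert(*pp != NULL);    /* e must be in this store */
        pp = &(*pp)->hnext;
    }
    *pp = e->hnext;
    free(e);
}
```

Ordered traversal still walks `head` through `next` exactly as before, and the extra cost per element is one pointer plus the bucket array.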

As with any optimization step that you take, make sure you measure your code to find the real bottleneck before you make any changes.

If you care about performance, you definitely should. If you're iterating through the thing to find a certain element with any regularity, it's going to be worth it to use a hash table. If it's a rare case, though, and the ordinary use of the list is not a search, then there's no reason to worry about it.

If you are only iterating over the collection, I don't see any advantage in using a hash map.

I advise against hashes in almost all cases.

There are two reasons. First, the size of the hash table is fixed (unless you also implement resizing).

Second, and much more importantly, there is the hashing algorithm. How do you know you've got it right? How will it behave with real data rather than test data?

I suggest a balanced B-tree. Lookup is always O(log n), with no uncertainty about a hash algorithm and no size limit.
