
Why is there not an implementation of a linked list in JavaScript?

For example Java has both ArrayList and LinkedList which behave as one would expect regarding Big O.

JavaScript has the array ([]), which behaves like a dynamic array: you can insert into it and delete from it wherever you like, in the middle, at the end, and so on.

Using a linked list has better insertion and deletion time for some cases with large data sets. I would prefer not to implement one myself or use a library. Are there plans to add one in the future, if it does not already exist?

To explain this, let's first look at why Java's LinkedList is almost entirely useless. That should then segue into why an actually useful LinkedList is a bit tricky to implement, API-wise.

Using a linked list has better insertion and deletion time for some cases with large data sets.

No it doesn't, except in very very exotic circumstances. Let's get to it.

Let's say you have a list with 50 elements inside it, and you want to add something right in the middle. For arraylists, this means the system has to 'move' 25 elements up a slot so that there's 'room', and it gets a little worse if the backing array is exactly at capacity (in which case we have to create a new array, copy the original to this newly created one in two chunks and then set the new value at the right index). But at least the system knows where to cut.
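To make the 'move 25 elements up a slot' step concrete, here is a minimal sketch in JavaScript (the language the question is about). `insertWithShift` is a hypothetical helper written for illustration; real engines implement `Array.prototype.splice` natively, and this only mimics the element moves so that the count is visible.

```javascript
// Hypothetical helper: inserts `value` at `index` by shifting later elements
// up one slot, and reports how many elements had to move.
function insertWithShift(arr, index, value) {
  let moved = 0;
  // Walk from the end down to `index`, copying each element one slot up.
  for (let i = arr.length; i > index; i--) {
    arr[i] = arr[i - 1];
    moved++;
  }
  arr[index] = value;
  return moved;
}

const list = Array.from({ length: 50 }, (_, i) => i);
const moved = insertWithShift(list, 25, "new");
console.log(moved, list.length, list[25]); // 25 51 new
```

The moves are all over one contiguous block of memory, which matters a great deal for real-world speed.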

Now let's look at linked lists. In theory, it's an O(1) operation: create a new tracker object, set its 'prev' to the 24th element and its 'next' to the 25th element, then update the tracker of the 24th element so that its 'next' points to the new tracker, and update the tracker of what was the 25th element so that its 'prev' points to the new tracker. Done. This algorithm works even if the list has a few bazillion entries in it. O(1). Right?

No. The problem is, how do you get there?

list.add(25, newItem) cannot magically skip around LinkedList's nearly fatal downside, which is: it has to iterate through 25 elements just to get to the right tracker in the first place. In other words, LinkedList's .add(idx, newItem) method is just as O(n) as ArrayList's is!
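A rough JavaScript sketch of that problem, assuming a hand-rolled doubly linked list. `Tracker`, `TrackerList` and `insertAt` are hypothetical names for illustration, not Java's or any library's API. The relinking is O(1), but the walk to get there is not.

```javascript
// Hypothetical minimal doubly linked list, for illustration only.
class Tracker {
  constructor(value) {
    this.value = value;
    this.prev = null;
    this.next = null;
  }
}

class TrackerList {
  constructor() {
    this.start = null;
    this.end = null;
    this.size = 0;
  }

  // Appends at the end: O(1), because `end` is tracked.
  push(value) {
    const tracker = new Tracker(value);
    if (this.end === null) {
      this.start = this.end = tracker;
    } else {
      tracker.prev = this.end;
      this.end.next = tracker;
      this.end = tracker;
    }
    this.size++;
  }

  // Inserts before the tracker currently at `index` (0 <= index < size).
  // Returns the number of hops needed to find that tracker.
  insertAt(index, value) {
    let hops = 0;
    let current = this.start;
    while (hops < index) { // the O(n) part: walking to the position
      current = current.next;
      hops++;
    }
    const tracker = new Tracker(value); // the O(1) part: relinking pointers
    tracker.prev = current.prev;
    tracker.next = current;
    if (current.prev !== null) current.prev.next = tracker;
    else this.start = tracker;
    current.prev = tracker;
    this.size++;
    return hops;
  }
}

const linked = new TrackerList();
for (let i = 0; i < 50; i++) linked.push(i);
const hops = linked.insertAt(25, "new");
console.log(hops); // 25
```

Java's LinkedList starts the walk from whichever end is closer, which halves the hops at best; the middle stays the worst case and the operation stays O(n).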

If you leave big-O land behind and get to pragmatics, one might say that LinkedList has excellent performance when you add near the 'start' of the list, whereas ArrayList is at its worst there (LinkedList needs to iterate over only a few trackers to get to the right one, whereas ArrayList needs to move a gigantic chunk). But you don't want to rely on that: once we leave theoretical performance models (that is, big-O notation) behind and look at actual performance, LinkedList's is truly deplorable. Just about everything non-algorithmic that could go wrong does go wrong with LinkedList.

LinkedList's 'fast at the start' grows to perfection once we talk about 'add at the beginning', with guaranteed O(1) behaviour. However, this is kinda fake: you can trivially get the same performance out of an ArrayList by wrapping it in a thing that reverses all operations (.get(x) is implemented as .get(size() - 1 - x), for example), because ArrayList is amortized O(1) for insertion at the end. So that's not much of a bonus. Mostly, just use ArrayDeque, which has fantastic performance for adding at the start and is just as great for adding at the end.
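The 'wrap it into a thing that reverses all operations' trick might look like this in JavaScript. `ReversedList` is a hypothetical class made up for this sketch, and it only covers `get` and adding at the front.

```javascript
// Hypothetical wrapper: stores elements in reverse so that "add at the
// beginning" becomes a push at the end of the backing array.
class ReversedList {
  constructor() {
    this.backing = [];
  }

  get size() {
    return this.backing.length;
  }

  // Logical index x lives at backing index size - 1 - x.
  get(x) {
    return this.backing[this.size - 1 - x];
  }

  // Add at the logical beginning: a push on the backing array,
  // which is amortized O(1).
  addFirst(value) {
    this.backing.push(value);
  }
}

const rev = new ReversedList();
rev.addFirst("c");
rev.addFirst("b");
rev.addFirst("a");
console.log(rev.get(0), rev.get(1), rev.get(2)); // a b c
```

The price is that adding at the logical end now becomes the slow operation, which is why a double-ended structure such as ArrayDeque is usually the better answer.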

About those pragmatics:

LinkedList needs a tracker object: An extra object which has 3 fields:

  • value: The actual value at this position in the linked list.
  • prev: The tracker object for the item before us in this list (because LinkedList is two-way traversable; if you don't need two-way, you can leave this one out). It's null for the first element.
  • next: The tracker object for the item after us in this list. It's null for the last.

Finally, the list itself is simply 2 fields: start and end, with start pointing at the tracker of the first object and end at the tracker of the last. In an empty LinkedList, both are null.

Those tracker objects are expensive.

Modern CPUs cannot access main memory directly. At all. They can only access on-die cache, which is filled and flushed one whole cache line at a time. The only operations the CPU can send to the memory controller are 'flush this ENTIRE cache line to main memory' (it cannot flush half of one; the size depends on your CPU, but 64 bytes is typical) and 'load this ENTIRE cache line from main memory'. When the data isn't already in cache, these operations take on the order of hundreds of CPU cycles, so the CPU is really gonna do quite the thumb twiddling whilst the slow-as-molasses memory controller does its work. That main memory bank is quite a few light-nanoseconds away from the CPU; that alone makes it slow as molasses!

Hence, when you are talking about a smallish ArrayList, given that its backing array is in practice contiguous in memory, as long as the entire list fits in the on-die cache, effectively all operations on it are instantaneous, and that whole O(n) thing sounds nice but is just entirely pointless, really.

As they say: in theory, practice is just like the theory. But in practice... it usually is nowhere near.

LinkedList goes in the opposite direction. The nature of modern CPU design ('modern' in quotes here: on-die caches and no direct memory access by the CPU are, at this point, over a decade old) is actually bad news for it: those extra trackers have a nasty tendency to not be contiguous, meaning a full traversal of a linked list causes a load of cache misses, and each cache miss comes with hundreds of CPU cycles' worth of free thumb twiddling. OUCH.

So how do I squeeze O(1) performance out of this thing??

In Java? The only way is to use .listIterator(), or .iterator().remove(). Those are the only way to get O(1) performance out of a LinkedList where ArrayList has O(n)!

You can use the ListIterator to iterate your way to the right position (this will be O(n) if you want to add in the middle), but from there you can add as much as you like, and each .add operation is indeed O(1), though the trackers you're making are likely in a different cache line, so you're negatively impacting any future performance of this list.
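The same 'walk once, then insert cheaply' idea can be sketched in JavaScript. This is not Java's ListIterator, just a rough analogue: `makeTracker`, `fromArray`, `insertAfter` and `toArray` are hypothetical helpers, and the `cursor` variable plays the role of the iterator position.

```javascript
// Hypothetical doubly linked tracker; not Java's API, just the same idea.
function makeTracker(value) {
  return { value, prev: null, next: null };
}

// Builds a chain from an array and returns its first tracker.
function fromArray(values) {
  let first = null;
  let last = null;
  for (const v of values) {
    const t = makeTracker(v);
    if (last === null) {
      first = t;
    } else {
      last.next = t;
      t.prev = last;
    }
    last = t;
  }
  return first;
}

// O(1): links a new tracker directly after `cursor` and returns it.
function insertAfter(cursor, value) {
  const t = makeTracker(value);
  t.prev = cursor;
  t.next = cursor.next;
  if (cursor.next !== null) cursor.next.prev = t;
  cursor.next = t;
  return t;
}

// Collects the values from `first` to the end of the chain.
function toArray(first) {
  const out = [];
  for (let t = first; t !== null; t = t.next) out.push(t.value);
  return out;
}

const first = fromArray([1, 2, 3, 4]);
// Pay for the O(n) walk once...
let cursor = first.next; // the tracker holding 2
// ...then every insert at the cursor is O(1).
for (const v of ["a", "b", "c"]) cursor = insertAfter(cursor, v);
console.log(toArray(first).join(" ")); // 1 2 a b c 3 4
```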

That sucks. Is there a better way?

There surely is! Imagine a linked list of Strings. But now imagine that Java's own String class has 2 more fields than you're used to: String next; and String prev;, with next and prev pointing at the next/previous string in the list. Now you can just 'add a new thing between this string and the next string in the list, please', from any string, by simply updating .next.prev, and then .next, to point at the new string (and, of course, assigning the next and prev fields in the new string the right values). Now it doesn't matter how you got to any item in the list: once you have it, however obtained, you can do O(1) ops on the list. And we even get to 'save' on trackers, since we don't need 'em (fields within a single object are guaranteed contiguous, though note that all non-primitive fields are, of course, pointers, and the thing they're pointing at might not be).

But Java doesn't work that way.

Some languages make it easy to make an ersatz 'combi-type' that acts in memory as a single new type (i.e. guarantees contiguousness of the combination of some type and some additions to it); this is often called 'mixins'. With that power, you can make your own linked lists, and you wouldn't even need any type named LinkedList: some type just 'grows' next and prev variables on command.

Java just isn't like that. JavaScript can be: objects really are just hashmaps, and you're free to introduce a prev and next pointer if you want. To do this, you don't need built-in types of any sort; you just need a tutorial, really.
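As a rough illustration of growing prev and next on ordinary objects, here is a small sketch. The objects and the helpers `linkAfter`, `unlink` and `names` are all hypothetical; the point is only that there is no separate tracker, and that holding an item is enough to remove it in O(1).

```javascript
// Hypothetical domain objects; `prev` and `next` get bolted onto them
// directly, so there is no separate tracker object.
const alice = { name: "Alice" };
const bob = { name: "Bob" };
const carol = { name: "Carol" };

// O(1): links `b` directly after `a`.
function linkAfter(a, b) {
  b.prev = a;
  b.next = a.next || null; // `a.next` may not exist yet on a fresh object
  if (b.next !== null) b.next.prev = b;
  a.next = b;
}

// O(1): removes `item` from whatever chain it is in, given only the item.
function unlink(item) {
  if (item.prev) item.prev.next = item.next;
  if (item.next) item.next.prev = item.prev;
  item.prev = null;
  item.next = null;
}

// Collects the names from `head` to the end of the chain.
function names(head) {
  const out = [];
  for (let cur = head; cur; cur = cur.next) out.push(cur.name);
  return out.join(" ");
}

linkAfter(alice, carol);
linkAfter(alice, bob);
const before = names(alice);
console.log(before); // Alice Bob Carol

unlink(bob); // no index and no traversal needed
const after = names(alice);
console.log(after); // Alice Carol
```

The design trade-off is that an object linked this way can only be a member of one such list at a time.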

Still, it'd have been nice

Actually, JavaScript has almost no batteries included. Famously, something as crazy simplistic as left-padding a string long needed a rather notorious library (left-pad); a built-in String.prototype.padStart only arrived with ES2017.

So, more generally, the answer to 'why does JavaScript not have X baked in' is the very generally applicable: 'Because JavaScript just doesn't have much baked in at all'.

I don't believe you - I am a card-carrying member of the church of O(n)!

Well, some skepticism is quite healthy as a programmer, so good for you!

You should write some code to test your preconceptions. And to test my theories while you're at it!

Write code that, e.g., inserts in the middle using list.add(list.size() / 2, newElem), and time it both when list is an instance of ArrayList and when it is an instance of LinkedList. Make sure you use a framework that knows how to do this properly, because between hotspot compilation, JVM warm-up, optimizations that entirely skip code whose results are never used, modern CPU design, and the fact that modern OSes aren't realtime, it's really hard to get right. Hence, use the Java Microbenchmark Harness (JMH) to run these tests.

You'll find that it is quite difficult to create a scenario where LinkedList significantly outperforms ArrayList, and quite easy for the reverse. Even in cases where basic big-O notation would suggest otherwise.

So what should I use instead?

There are certainly scenarios where ArrayList's performance characteristics are no good. However, for just about every imaginable scenario where that's true, LinkedList is not the best, or even a good, alternative. Instead, look at ArrayDeque, or rewrite the algorithm to use TreeMap or HashMap, use databases, skip lists, primitive lists (as you can have really quite sizable primitive lists and still get fantastic performance), or EnumSets.

Most of these do not have JavaScript equivalents either, but all of them have third-party libraries in the Node ecosystem that do the job. Of course, if you go that route, you may eventually run into that whole left-pad debacle, but then you're kinda signing up for that the moment you decide to use JavaScript in the first place.


 