
Node.js Event loop

Is the Node.js I/O event loop single- or multithreaded?

If I have several I/O processes, node puts them in an external event loop. Are they processed in sequence (fastest first), or does the event loop process them concurrently (and with which limitations)?

Event Loop

The Node.js event loop runs on a single thread, which means the application code you write is evaluated on a single thread. Node.js itself uses many threads underneath through libuv, but you never have to deal with those when writing Node.js code.

Every I/O call requires you to register a callback. The call itself returns immediately, which lets you run multiple I/O operations in parallel without using threads in your application code. As soon as an I/O operation completes, its callback is pushed onto the event loop. It is executed once all the callbacks that were pushed onto the event loop before it have run.
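Here is a minimal sketch of the "returns immediately" point, assuming only Node's built-in fs module. The events array is a hypothetical helper that just records the order in which things happen:

```javascript
const fs = require('fs');

// Hypothetical helper: records the order in which things happen.
const events = [];

// Both calls only register a callback and return right away.
fs.stat('.', () => events.push('stat A done'));
fs.stat('.', () => events.push('stat B done'));

// This line runs before either callback, because callbacks only run
// after the currently executing code has finished.
events.push('both calls returned');
console.log(events); // [ 'both calls returned' ]
```

The two stat callbacks are appended later, once the operations complete and the event loop gets to them.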

There are a few methods to do basic manipulation of how callbacks are added to the event loop. Usually you shouldn't need these, but every now and then they can be useful.

At no point will there ever be two true parallel paths of execution, so all operations are inherently thread safe. There usually will be several asynchronous concurrent paths of execution that are being managed by the event loop.

Read More about the event loop

Limitations

Because of the event loop, node doesn't have to start a new thread for every incoming TCP connection. This allows node to service hundreds of thousands of requests concurrently, as long as you aren't calculating the first 1000 prime numbers for each request.

This also means it's important not to do CPU-intensive operations on the main thread, as these block the event loop and prevent other asynchronous paths of execution from continuing. It's also important not to use the sync variants of the I/O methods, as these block the event loop as well.
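To make the blocking effect concrete, here is a small sketch. The busy loop is only a stand-in for real CPU-bound work, and the variable names are made up for this example:

```javascript
// Hypothetical measurement: how late does a 0 ms timer fire when the
// main thread is busy?
const scheduledAt = Date.now();
let timerDelayMs = null;

setTimeout(() => {
  timerDelayMs = Date.now() - scheduledAt;
  console.log(`timer fired after ${timerDelayMs} ms`);
}, 0);

// Simulated CPU-bound work: spin for about 100 ms without yielding.
const busyUntil = Date.now() + 100;
while (Date.now() < busyUntil) {
  // nothing else can run while this loop is spinning
}
```

Although the timer was requested with a 0 ms delay, its callback cannot run until the loop above has finished, so it fires roughly 100 ms late.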

If you want to do CPU-heavy things, you should either delegate them to a different process that can execute the CPU-bound operation more efficiently, or write them as a native Node.js add-on.
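One possible way to delegate to a different process is sketched below, using child_process.execFile to start a second Node.js process. The childScript string and the countPrimesInChild helper are hypothetical names for this example, and counting the primes below 1000 is just a placeholder workload:

```javascript
const { execFile } = require('child_process');

// CPU-bound script run in the child: counts the primes below 1000.
const childScript = `
  let count = 0;
  for (let n = 2; n < 1000; n++) {
    let isPrime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { isPrime = false; break; }
    }
    if (isPrime) count++;
  }
  console.log(count);
`;

// Hypothetical helper: runs the script in a separate Node.js process.
function countPrimesInChild() {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', childScript], (err, stdout) => {
      if (err) return reject(err);
      resolve(Number(stdout.trim()));
    });
  });
}

const primeCount = countPrimesInChild();
primeCount.then((count) => console.log('primes below 1000:', count)); // 168
console.log('parent event loop is still free');
```

The parent only waits for the child's output as an ordinary I/O event, so its own event loop stays responsive while the child does the number crunching.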

Read more about use cases

Control Flow

In order to manage writing many callbacks you will probably want to use a control flow library. I believe this is currently the most popular callback based library:

I've used callbacks and they pretty much drove me crazy. I've had a much better experience using promises; bluebird is a very popular and fast promise library:

I've found this to be a pretty sensitive topic in the node community (callbacks vs promises), so by all means, use what you feel will work best for you personally. A good control flow library should also give you async stack traces, this is really important for debugging.
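For comparison, here is a rough sketch of the promise style. It uses the promises built into Node.js and util.promisify rather than bluebird, and the isDirectory helper is a made-up name for this example:

```javascript
const fs = require('fs');
const { promisify } = require('util');

// Wrap the callback-style fs.stat in a function that returns a promise.
const statAsync = promisify(fs.stat);

// Hypothetical helper: reports whether a path is a directory.
async function isDirectory(path) {
  const stats = await statAsync(path);
  return stats.isDirectory();
}

const dirCheck = isDirectory('.');
dirCheck
  .then((isDir) => console.log('"." is a directory:', isDir))
  .catch((err) => console.error('stat failed:', err.message));
```

The same I/O call is still asynchronous underneath; only the way you chain the follow-up work changes.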

The Node.js process will finish when the last callback in the event loop finishes its path of execution and doesn't register any other callbacks.

This is not a complete explanation. I advise you to check out the following thread, it's pretty up to date:

How do I get started with Node.js

From Willem's answer:

The Node.js event loop runs under a single thread. Every I/O call requires you to register a callback. Every I/O call also returns immediately, this allows you to do multiple IO operations in parallel without using threads.

I would like to start with the quote above, because it reflects one of the most common misunderstandings about Node.js that I see everywhere.

Node.js does not magically handle all those asynchronous calls with just one thread and still keep that thread unblocked. Internally it uses Google's V8 engine and a library called libuv (written in C) that enables it to delegate some potentially asynchronous work to other worker threads (a pool of threads waiting for work to be delegated from the main Node thread). Later, when those threads finish their work, their callbacks are queued, and that is how the event loop learns that the work of a worker thread is complete.
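As a rough illustration, the asynchronous crypto.pbkdf2 call is one of the Node.js APIs that runs on libuv's thread pool. The password, salt, and iteration count below are arbitrary values chosen for this sketch:

```javascript
const crypto = require('crypto');

const log = [];
let finished = 0;

// Each call is handed to libuv's thread pool and returns immediately.
for (let i = 1; i <= 4; i++) {
  crypto.pbkdf2('password', 'salt', 10000, 64, 'sha512', (err, key) => {
    if (err) throw err;
    finished++;
    log.push(`hash ${i} done`);
    if (finished === 4) console.log(log);
  });
}

// Runs first: the main thread was never blocked by the hashing.
log.push('main thread is free');
```

The order in which the four hashes complete is not guaranteed, since they run on separate worker threads.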

The main point and advantage of Node.js is that you never need to care about those internal threads, and they stay away from your code! All the nasty synchronization work that normally comes with multithreaded environments is abstracted away by Node.js, and you can happily work on your single thread (the main Node thread) in a more programmer-friendly environment, while still benefiting from the performance of multiple threads.

Below is a good post if anyone is interested: When is the thread pool used?

To understand the event loop, you first have to know how Node.js is implemented.

The Node.js core is built on two components:

  • the V8 JavaScript runtime engine

  • libuv, which handles non-blocking I/O operations, threads, and concurrent operations for you

With JavaScript you write code that runs on a single thread, but that does not mean everything executes on one thread. You can also spread work across multiple processes using the cluster module in Node.js.

Now, when you want to execute some code like:

let fs = require('fs');
fs.stat('path', (err, stat) => {
    // do something with the stat
    console.log('second');
});
console.log('first');

  • At a high level, the execution of this code goes like this: the V8 engine first runs the code line by line. When it reaches fs.stat, which is a Node.js API (very similar to web APIs like setTimeout that the browser handles for us), it passes the operation to libuv together with your callback. libuv carries out the operation, and when it is done it signals completion and your callback is placed on the event queue. V8 then executes your callback, but it always checks that the stack is empty before it takes anything from the queue. Always remember that!

Well, to understand how Node.js handles I/O events, you must understand the Node.js event loop properly.

As the name suggests, the event loop is a loop that runs cycle after cycle, on a round-robin basis, until there are no events left in the loop or the app is closed.

The event loop is one of the most important features of Node.js; it is what makes asynchronous programming in Node.js possible.

When the program starts, we are in a Node process, on the single thread where the event loop runs. The most important thing to know is that the event loop is where all the application code inside callback functions is executed.

So, basically, all code that is not top-level code runs in the event loop. Some parts (mostly heavy duties) may get offloaded to the thread pool ( When is the thread pool used? ); the event loop takes care of offloading those heavy duties and receives their results back as events.

It is the heart of the Node architecture, and Node.js is built around callback functions. Because Node uses an event-driven architecture, callbacks are triggered as soon as some piece of work finishes at some point in the future.

When an application receives an HTTP request on a Node server, a timer expires, or a file finishes being read, each of these emits an event as soon as its work is done. The event loop then picks up these events and calls the callback functions associated with each one. It's usually said that the event loop does the orchestration, which simply means that it receives events, calls their callback functions, and offloads the more expensive tasks to the thread pool.

Now, how does all this actually work behind the scenes? In what order are these callbacks executed?

Well, when we start our Node application, the event loop starts running right away. The event loop has multiple phases, and each phase has its own callback queue. The four most important phases are: 1. expired timer callbacks, 2. I/O polling and callbacks, 3. setImmediate callbacks, and 4. close callbacks. There are other phases that are used internally by Node.


So, the first phase takes care of callbacks of expired timers, for example, from the setTimeout() function. So, if there are callback functions from timers that just expired, these are the first ones to be processed by the event loop.

**The most important thing is this: if a timer expires later, while one of the other phases is being processed, the callback of that timer will only be called once the event loop comes back to this first phase. It works like this in all four phases.**

So the callbacks in each queue are processed one by one until none are left in the queue, and only then does the event loop enter the next phase. For example, suppose 1000 setTimeout timers have expired and the event loop is in the first phase: all 1000 setTimeout callbacks will execute one by one, and then the loop will move on to the next phase (I/O polling and callbacks).

Next up, we have I/O polling and the execution of I/O callbacks. Here I/O stands for input/output, and polling basically means looking for new I/O events that are ready to be processed and putting them into the callback queue.

In the context of a Node application, I/O mainly means things like networking and file access, so this phase is where probably 99% of general application code gets executed.

The next phase is for setImmediate callbacks. setImmediate is a special kind of timer that we can use if we want to process callbacks immediately after the I/O polling and execution phase.

And finally, the fourth phase is for close callbacks. In this phase all close events are processed, for example when a server or a WebSocket shuts down.

These are the four phases of the event loop, but besides these four callback queues there are actually two other queues: 1. the nextTick() queue and 2. the microtask queue (which is mainly for resolved promises).


If there are any callbacks waiting in one of these two queues, they are executed right after the current phase of the event loop finishes (in Node.js 11 and later, right after the currently running callback finishes) instead of waiting for the entire loop/cycle to finish.

In other words, after each of these four phases, if there are any callbacks in these two special queues, they are executed right away. Now imagine that a promise resolves and returns some data from an API call while the callback of an expired timer is running. In this case, the promise callback will be executed right after the timer callback finishes.
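The timer-plus-promise case can be sketched like this. The order array is just a hypothetical recorder, and the result in the comment assumes Node.js 11 or later, where the two special queues are drained after every individual callback (older versions could run both timers first):

```javascript
const order = [];

setTimeout(() => {
  order.push('timer A');
  // Both of these are queued while a timer callback is running.
  Promise.resolve().then(() => order.push('promise'));
  process.nextTick(() => order.push('nextTick'));
}, 0);

setTimeout(() => {
  order.push('timer B');
  // On Node.js 11 and later:
  // [ 'timer A', 'nextTick', 'promise', 'timer B' ]
  console.log(order);
}, 0);
```

Both special-queue callbacks run before the second timer callback, and the nextTick callback runs before the promise callback.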

The same logic also applies to the nextTick() queue. process.nextTick() is a function that we can use when we really, really need to execute a certain callback right after the current event loop phase. It's a bit similar to setImmediate, with the difference that setImmediate only runs after the I/O callback phase.
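A tiny sketch of that difference, with a hypothetical sequence array to record what runs when:

```javascript
const sequence = [];

setImmediate(() => {
  sequence.push('setImmediate');
  console.log(sequence);
  // [ 'synchronous code', 'nextTick', 'setImmediate' ]
});
process.nextTick(() => sequence.push('nextTick'));
sequence.push('synchronous code');
```

Even though setImmediate is registered first, the nextTick callback runs as soon as the current code finishes, while the setImmediate callback has to wait for its own phase of the loop.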

All of the above can happen in one tick/cycle of the event loop. In the meantime, new events may have arrived for a particular phase, or another timer may have expired; the event loop will handle those in a new cycle.

So now it's time to decide whether the loop should continue to the next tick or the program should exit. Node simply checks whether there are any timers or I/O tasks still running in the background. If there aren't any, it exits the application. But if there are pending timers or I/O tasks, Node keeps running the event loop and starts the next cycle.


For example, when a Node application is listening for incoming HTTP requests, we are basically running an endless I/O task in the event loop. That is why Node.js keeps running and keeps listening for new incoming HTTP requests instead of just exiting the application.

Also, when we are writing or reading a file in the background, that is an I/O task too, and it makes sense that the app doesn't exit while it's working with that file, right?
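The same "keep running while something is pending" rule can be seen with a plain timer. In this small sketch (the tick counter and the 10 ms interval are arbitrary choices), the process stays alive only as long as the interval is active:

```javascript
let ticks = 0;

// A pending interval timer keeps the event loop (and the process) alive.
const interval = setInterval(() => {
  ticks++;
  console.log(`tick ${ticks}`);
  if (ticks === 3) {
    // With no timers or I/O left, the loop has nothing to wait for.
    clearInterval(interval);
  }
}, 10);

// Fires once the event loop has run out of work.
process.on('exit', () => console.log(`exiting after ${ticks} ticks`));
```

Once clearInterval removes the last pending timer, nothing is left to wait for and the process exits on its own.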

Now, the event loop in practice:

const fs = require('fs');
setTimeout(()=>console.log('Timer 1 finished'), 0);
fs.readFile('test-file.txt', ()=>{
    console.log('I/O finished');
});
setImmediate(()=>console.log('Immediate 1 finished'))
console.log('Hello from the top level code');

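Output: (shown as an image in the original post)

Well, the first line is Hello from the top level code, which is expected because that code gets executed immediately. After that we get three more lines. Timer 1 finished is expected next because of phase one, as we discussed before. After that you might expect I/O finished to be printed, because we said that setImmediate runs after the I/O callback phase, but Immediate 1 finished comes first. The reason is that these timers were not scheduled from inside an I/O callback: they were scheduled by top-level code, which does not run inside any callback function, and the file read has usually not completed by the time the loop first reaches the I/O phase. Note also that when a 0 ms timer and setImmediate are both scheduled from top-level code, their relative order is not guaranteed and can vary between runs.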

Now let's do another test:

const fs = require('fs');
setTimeout(()=>console.log('Timer 1 finished'), 0);
setImmediate(()=>console.log('Immediate 1 finished'));

fs.readFile('test-file.txt', ()=>{
    console.log('I/O finished');
    setTimeout(()=>console.log('Timer 2 finished'), 0);
    setImmediate(()=>console.log('Immediate 2 finished'));
    setTimeout(()=>console.log('Timer 3 finished'), 0);
    setImmediate(()=>console.log('Immediate 3 finished'));
});
console.log('Hello from the top level code')

Output: (shown as an image in the original post)

The output is as expected, right? Now let's add some delay:

const fs = require('fs');
setTimeout(()=>console.log('Timer 1 finished'), 0);
setImmediate(()=>console.log('Immediate 1 finished'));

fs.readFile('test-file.txt', ()=>{
    console.log('I/O finished');
    setTimeout(()=>console.log('Timer 2 finished'), 3000);
    setImmediate(()=>console.log('Immediate 2 finished'));
    setTimeout(()=>console.log('Timer 3 finished'), 0);
    setImmediate(()=>console.log('Immediate 3 finished'));
});
console.log('Hello from the top level code')

Output: (shown as an image in the original post)

Everything scheduled inside the I/O callback ran right away, except Timer 2, which ran in a later cycle of the event loop because of its 3000 ms delay.

Now let's add nextTick() and see how Node.js behaves:

const fs = require('fs');
setTimeout(()=>console.log('Timer 1 finished'), 0);
setImmediate(()=>console.log('Immediate 1 finished'));

fs.readFile('test-file.txt', ()=>{
    console.log('I/O finished');
    setTimeout(()=>console.log('Timer 2 finished'), 3000);
    setImmediate(()=>console.log('Immediate 2 finished'));
    setTimeout(()=>console.log('Timer 3 finished'), 0);
    setImmediate(()=>console.log('Immediate 3 finished'));
    process.nextTick(()=>console.log('Process Next Tick'));
});
console.log('Hello from the top level code')

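Output: (shown as an image in the original post)

Well, of the callbacks scheduled inside the I/O callback, the first one executed is the one passed to process.nextTick(), as expected, right? That is because nextTick callbacks sit in their own queue, which is processed before the event loop moves on to the next phase.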

If you run this simple node code

console.log('starting')
setTimeout(()=>{
    console.log('0sec')
}, 0)
setTimeout(()=>{
    console.log('2sec')
}, 2000)
console.log('end')

What do you expect the output to be? If it is

starting 
0sec
end 
2sec 

that is a wrong guess. We will get

starting 
end 
0sec 
2sec 

because Node will never run callbacks from the event loop before main() has finished.

So basically, main() goes on the stack first, then console.log('starting'), so you see that printed first. After that, setTimeout(()=>{console.log('0sec')}, 0) goes on the stack and is handed to the Node API (the timer is tracked by libuv, a library written in C, outside your JavaScript code, even though the code above is single-threaded). When its time is up, the callback moves to the event loop's queue, but Node can't run it until the stack is empty. So the next line, the setTimeout of 2 seconds, is pushed onto the stack and then handed to the Node API, which waits 2 seconds before moving its callback to the event loop. In the meantime the next line, console.log('end'), is executed, so we see the end message before 0sec because of Node's non-blocking nature. After that the code is finished, main is popped off the stack, and it is the event loop's turn to run the queued callbacks: first 0sec is printed and, after that, 2sec.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 