How V8 optimises code using hidden classes and inline caching

Recently I came across the concepts of hidden classes and inline caching, which V8 uses to optimise JavaScript code. Cool.

I understand that objects are represented internally using hidden classes, and that two objects may have the same properties but different hidden classes (depending on the order in which the properties are assigned).
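As a rough illustration of the ordering point (not a view into V8 itself, since hidden classes are not exposed to ordinary JavaScript), the sketch below builds two objects with the same four properties added in different orders. The only thing it can observe is the insertion order reported by Object.keys; the names first and second are made up for this example.

```javascript
// Illustrative only: insertion order is observable from plain JavaScript,
// while hidden classes themselves are internal to V8.
function Point(x, y) {
  this.x = x;
  this.y = y;
}

const first = new Point(1, 1);
first.a = 1;
first.b = 1;

const second = new Point(1, 1);
second.b = 1; // same properties as `first`, added in the opposite order
second.a = 1;

const firstOrder = Object.keys(first).join(",");
const secondOrder = Object.keys(second).join(",");

console.log(firstOrder);  // x,y,a,b
console.log(secondOrder); // x,y,b,a
```

Both objects answer the same property reads, yet they were built along different paths, which is the situation the paragraph above describes.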

Also, V8 uses inline caching to access an object's properties directly by offset, rather than using the object's hidden class to determine the offsets.

Code -

function Point(x, y) {
    this.x = x;
    this.y = y;
}

function processPoint(point) {
    // console.log(point.x, point.y, point.a, point.b);
    // let x = point;
}

function main() {
    let p1 = new Point(1, 1);
    let p2 = new Point(1, 1);
    let p3 = new Point(1, 1);
    const N = 300000000;
    p1.a = 1;
    p1.b = 1;
    p2.b = 1;
    p2.a = 1;
    p3.a = 1;
    p3.b = 1;
    let start_1 = new Date();
    for(let i = 0; i< N; i++ ) {
        if (i%4 != 0) {
            processPoint(p1);
        } else {
            processPoint(p2)
        }
    }
    let end_1 = new Date();
    let t1 = (end_1 - start_1);
    let start_2 = new Date();
    for(let i = 0; i< N; i++ ) {
        if (i%4 != 0) {
            processPoint(p1);
        } else {
            processPoint(p1)
        }
    }
    let end_2 = new Date();
    let t2 = (end_2 - start_2);
    let start_3 = new Date();
    for(let i = 0; i< N; i++ ) {
        if (i%4 != 0) {
            processPoint(p1);
        } else {
            processPoint(p3)
        }
    }
    let end_3 = new Date();
    let t3 = (end_3 - start_3);
    console.log(t1, t2, t3);
}

(function(){
    main();
})();

I was expecting results like t1 > (t2 = t3) because:

First loop: V8 will try to optimise after running twice, but it will soon encounter a different hidden class, so it will deoptimise.

Second loop: the same object is passed every time, so inline caching can be used.

Third loop: same as the second loop, because the hidden classes are the same.

But the results are not what I expected. I got the following (with similar results when running again and again):

3553 4805 4556

Questions:

  1. Why were the results not as expected? Where did my assumptions go wrong?

  2. How can I change this code to demonstrate the performance improvements from hidden classes and inline caching?

  3. Did I get it all wrong from the start?

  4. Are hidden classes present just for memory efficiency, by letting objects share them?

  5. Are there any other sites with simple examples of these performance improvements?

I am using Node 8.9.4 for testing. Thanks in advance.

Sources:

  1. https://blog.sessionstack.com/how-javascript-works-inside-the-v8-engine-5-tips-on-how-to-write-optimized-code-ac089e62b12e

  2. https://draft.li/blog/2016/12/22/javascript-engines-hidden-classes/

  3. https://richardartoul.github.io/jekyll/update/2015/04/26/hidden-classes.html

and many more.

V8 developer here. The summary is: Microbenchmarking is hard, don't do it.

First off, with your code as posted, I'm seeing 380 380 380 as the output, which is expected because the function processPoint is empty, so all loops do the same work (i.e., no work) no matter which point object you select.

Measuring the performance difference between monomorphic and 2-way polymorphic inline caches is difficult, because it is not large, so you have to be very careful about what else your benchmark is doing. console.log, for example, is so slow that it'll overshadow everything else.
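A minimal sketch of keeping console.log out of the timed region, assuming Node.js. The names timeIt and sumTo are hypothetical and exist only for this illustration; the point is just that the measured function returns a value and printing happens after the clock has stopped.

```javascript
// Hypothetical helper: times `fn` with process.hrtime() and returns both the
// result and the elapsed milliseconds, so nothing is printed while timing.
function timeIt(fn) {
  const start = process.hrtime();
  const result = fn();
  const [seconds, nanoseconds] = process.hrtime(start);
  return { result, ms: seconds * 1e3 + nanoseconds / 1e6 };
}

// Placeholder workload that returns a value instead of logging it.
function sumTo(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += i;
  }
  return sum;
}

const measured = timeIt(() => sumTo(1000));
// Print only after the measurement is complete.
console.log(measured.result, measured.ms.toFixed(3));
```

Returning the result also gives the caller something to consume, which makes it harder for the timed work to be optimized away entirely. It does not remove the other sources of noise discussed in this answer.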

You'll also have to be careful about the effects of inlining. When your benchmark has many iterations, the code will get optimized (after running waaaay more than twice), and the optimizing compiler will (to some extent) inline functions, which can allow subsequent optimizations (specifically: eliminating various things) and thereby can significantly change what you're measuring. Writing meaningful microbenchmarks is hard; you won't get around inspecting generated assembly and/or knowing quite a bit about the implementation details of the JavaScript engine you're investigating.

Another thing to keep in mind is where inline caches are, and what state they'll have over time. Disregarding inlining, a function like processPoint doesn't know or care where it's called from. Once its inline caches are polymorphic, they'll remain polymorphic, even if later on in your benchmark (in this case, in the second and third loop) the types stabilize.
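To make that concrete, here is a small sketch with one property load fed objects of two different layouts. The names readA, onlyA, and bThenA are made up for this example, and nothing here inspects V8's internal state; the comments simply restate the behavior described above.

```javascript
// Sketch: one property load site, fed objects with two different layouts.
function readA(point) {
  return point.a; // the feedback for this load belongs to readA, not to its callers
}

const onlyA = { a: 1 };
const bThenA = { b: 1, a: 1 };

let total = 0;
total += readA(onlyA);  // the load site has seen one layout
total += readA(bThenA); // the load site has now seen a second layout

// As described above, calling with a single layout from here on
// does not make the load site monomorphic again.
for (let i = 0; i < 1000; i++) {
  total += readA(onlyA);
}

console.log(total); // 1002
```

This is why a benchmark that wants to compare cache states needs a separate copy of the function per case, rather than reusing one function across all loops.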

Yet another thing to keep in mind when trying to isolate effects is that long-running functions will get compiled in the background while they run, and will then at some point be replaced on the stack ("OSR"), which adds all sorts of noise to your measurements. When you invoke them with different loop lengths for warmup, they'll still get compiled in the background however, and there's no way to reliably wait for that background job. You could resort to command-line flags intended for development, but then you wouldn't be measuring regular behavior any more.

Anyhow, the following is an attempt to craft a test similar to yours that produces plausible results (about 100 180 280 on my machine):

function Point() {}

// These three functions are identical, but they will be called with different
// inputs and hence collect different type feedback:
function processPointMonomorphic(N, point) {
  let sum = 0;
  for (let i = 0; i < N; i++) {
    sum += point.a;
  }
  return sum;
}
function processPointPolymorphic(N, point) {
  let sum = 0;
  for (let i = 0; i < N; i++) {
    sum += point.a;
  }
  return sum;
}
function processPointGeneric(N, point) {
  let sum = 0;
  for (let i = 0; i < N; i++) {
    sum += point.a;
  }
  return sum;
}

let p1 = new Point();
let p2 = new Point();
let p3 = new Point();
let p4 = new Point();

const warmup = 12000;
const N = 100000000;
let sum = 0;
p1.a = 1;
p2.b = 1;
p2.a = 1;
p3.c = 1;
p3.b = 1;
p3.a = 1;
p4.d = 1;
p4.c = 1;
p4.b = 1;
p4.a = 1;
processPointMonomorphic(warmup, p1);
processPointMonomorphic(1, p1);
let start_1 = Date.now();
sum += processPointMonomorphic(N, p1);
let t1 = Date.now() - start_1;

processPointPolymorphic(2, p1);
processPointPolymorphic(2, p2);
processPointPolymorphic(2, p3);
processPointPolymorphic(warmup, p4);
processPointPolymorphic(1, p4);
let start_2 = Date.now();
sum += processPointPolymorphic(N, p1);
let t2 = Date.now() - start_2;

processPointGeneric(warmup, 1);
processPointGeneric(1, 1);
let start_3 = Date.now();
sum += processPointGeneric(N, p1);
let t3 = Date.now() - start_3;
console.log(t1, t2, t3);
