Do Immutable.js or Lazy.js perform short-cut fusion?

First, let me define short-cut fusion for those of you who don't know it. Consider the following array transformation in JavaScript:

    var a = [1,2,3,4,5].map(square).map(increment);

    console.log(a);

    function square(x) {
        return x * x;
    }

    function increment(x) {
        return x + 1;
    }

Here we have an array, [1,2,3,4,5], whose elements are first squared, [1,4,9,16,25], and then incremented, [2,5,10,17,26]. Hence, although we don't need the intermediate array [1,4,9,16,25], we still create it.

Short-cut fusion is an optimization technique which can get rid of intermediate data structures by merging some function calls into one. For example, short-cut fusion can be applied to the above code to produce:

    var a = [1,2,3,4,5].map(compose(square, increment));

    console.log(a);

    function square(x) {
        return x * x;
    }

    function increment(x) {
        return x + 1;
    }

    function compose(g, f) {
        return function (x) {
            return f(g(x));
        };
    }

As you can see, the two separate map calls have been fused into a single map call by composing the square and increment functions. Hence the intermediate array is not created.
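
The same idea extends to longer chains. The sketch below assumes a hypothetical variadic helper `pipe` and an extra function `double`, neither of which is in the code above, and checks that the fused call produces the same array as the chained calls. The rewrite is only safe for pure functions that use just their first argument, because `Array.prototype.map` also passes the index and the array to its callback.

```javascript
// Hypothetical helper (not from the original post): left-to-right
// composition of any number of unary functions.
function pipe(...fns) {
    return function (x) {
        return fns.reduce(function (acc, fn) { return fn(acc); }, x);
    };
}

function square(x) { return x * x; }
function increment(x) { return x + 1; }
function double(x) { return x * 2; } // extra step added for illustration

var xs = [1, 2, 3, 4, 5];

// Three map calls: two intermediate arrays plus the final one.
var chained = xs.map(square).map(increment).map(double);

// One map call: only the final array is allocated.
var fused = xs.map(pipe(square, increment, double));

console.log(chained); // [4, 10, 20, 34, 52]
console.log(fused);   // [4, 10, 20, 34, 52]
```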


Now, I understand that libraries like Immutable.js and Lazy.js emulate lazy evaluation in JavaScript. Lazy evaluation means that results are only computed when required.

For example, consider the above code. Although we square and increment each element of the array, we may not need all the results.

Suppose we only want the first 3 results. Using Immutable.js or Lazy.js we can get the first 3 results, [2,5,10], without calculating the last 2 results, [17,26], because they are not needed.
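
For comparison, the same behaviour can be sketched in plain JavaScript without either library, assuming an ES2015 environment with generators. The names `mapGen`, `takeGen` and `calls` are hypothetical and introduced only for this illustration; they are not the API of Immutable.js or Lazy.js.

```javascript
// Hypothetical helpers: these are NOT the Immutable.js or Lazy.js APIs.
function* mapGen(iterable, f) {
    for (const x of iterable) yield f(x);
}

function* takeGen(iterable, n) {
    if (n <= 0) return;
    let taken = 0;
    for (const x of iterable) {
        yield x;
        taken += 1;
        if (taken >= n) return; // stop before pulling another element
    }
}

// Count how often each function actually runs.
const calls = { square: 0, increment: 0 };

function square(x) { calls.square += 1; return x * x; }
function increment(x) { calls.increment += 1; return x + 1; }

const firstThree = Array.from(
    takeGen(mapGen(mapGen([1, 2, 3, 4, 5], square), increment), 3)
);

console.log(firstThree); // [2, 5, 10]
console.log(calls);      // { square: 3, increment: 3 }
```

Only three elements are ever squared and incremented; the elements 4 and 5 are never pulled from the source array.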

However, lazy evaluation just delays the calculation of results until required. It does not remove intermediate data structures by fusing functions.

To make this point clear, consider the following code which emulates lazy evaluation:

    var List = defclass({
        constructor: function (head, tail) {
            // A zero-argument function is treated as a thunk and installed
            // as a getter; anything else is stored as a plain value.
            if (typeof head !== "function" || head.length > 0)
                Object.defineProperty(this, "head", { value: head });
            else Object.defineProperty(this, "head", { get: head });

            if (typeof tail !== "function" || tail.length > 0)
                Object.defineProperty(this, "tail", { value: tail });
            else Object.defineProperty(this, "tail", { get: tail });
        },
        // Lazy map: head and tail are both deferred.
        map: function (f) {
            var l = this;

            if (l === nil) return nil;

            return cons(function () {
                return f(l.head);
            }, function () {
                return l.tail.map(f);
            });
        },
        // Lazy take: keeps at most n elements.
        take: function (n) {
            var l = this;

            if (l === nil || n === 0) return nil;

            return cons(function () {
                return l.head;
            }, function () {
                return l.tail.take(n - 1);
            });
        },
        // Strict map: forces every element of the list.
        mapSeq: function (f) {
            var l = this;

            if (l === nil) return nil;

            return cons(f(l.head), l.tail.mapSeq(f));
        }
    });

    var nil = Object.create(List.prototype);

    list([1,2,3,4,5])
        .map(trace(square))
        .map(trace(increment))
        .take(3)
        .mapSeq(log);

    function cons(head, tail) {
        return new List(head, tail);
    }

    function list(a) {
        return toList(a, a.length, 0);
    }

    function toList(a, length, i) {
        if (i >= length) return nil;

        return cons(a[i], function () {
            return toList(a, length, i + 1);
        });
    }

    function square(x) {
        return x * x;
    }

    function increment(x) {
        return x + 1;
    }

    function log(a) {
        console.log(a);
    }

    // Wraps f so that every call logs its name, arguments and result.
    function trace(f) {
        return function () {
            var result = f.apply(this, arguments);
            console.log(f.name, JSON.stringify([...arguments]), result);
            return result;
        };
    }

    function defclass(prototype) {
        var constructor = prototype.constructor;
        constructor.prototype = prototype;
        return constructor;
    }

As you can see, the function calls are interleaved and only the first three elements of the array are processed, proving that the results are indeed computed lazily:

square [1] 1
increment [1] 2
2
square [2] 4
increment [4] 5
5
square [3] 9
increment [9] 10
10

If lazy evaluation were not used, the output would be:

square [1] 1
square [2] 4
square [3] 9
square [4] 16
square [5] 25
increment [1] 2
increment [4] 5
increment [9] 10
increment [16] 17
increment [25] 26
2
5
10

However, if you look at the source code, each of the functions list, map, take and mapSeq returns an intermediate List data structure. No short-cut fusion is performed.


This brings me to my main question: do libraries like Immutable.js and Lazy.js perform short-cut fusion?

I ask because, according to the documentation, they “apparently” do. However, I am skeptical that they actually perform short-cut fusion.

For example, this is taken from the README.md file of Immutable.js:

Immutable also provides a lazy Seq, allowing efficient chaining of collection methods like map and filter without creating intermediate representations. Create some Seq with Range and Repeat.

So the developers of Immutable.js claim that their Seq data structure allows efficient chaining of collection methods like map and filter without creating intermediate representations (i.e. they perform short-cut fusion).

However, I don't see them doing so anywhere in their code. Perhaps I can't find it because they are using ES6 and I am not all that familiar with ES6 syntax.

Furthermore, in their documentation for Lazy Seq they mention:

Seq describes a lazy operation, allowing them to efficiently chain use of all the Iterable methods (such as map and filter).

Seq is immutable - Once a Seq is created, it cannot be changed, appended to, rearranged or otherwise modified. Instead, any mutative method called on a Seq will return a new Seq.

Seq is lazy - Seq does as little work as necessary to respond to any method call.

So it is established that Seq is indeed lazy. However, there are no examples to show that intermediate representations are indeed not created (which is what they claim).


Moving on to Lazy.js, we have the same situation. Thankfully, Daniel Tao wrote a blog post on how Lazy.js works, in which he mentions that at its heart Lazy.js simply does function composition. He gives the following example:

    Lazy.range(1, 1000)
        .map(square)
        .filter(multipleOf3)
        .take(10)
        .each(log);

    function square(x) {
        return x * x;
    }

    function multipleOf3(x) {
        return x % 3 === 0;
    }

    function log(a) {
        console.log(a);
    }
 <script src="https://rawgit.com/dtao/lazy.js/master/lazy.min.js"></script> 

Here the map, filter and take functions produce intermediate MappedSequence, FilteredSequence and TakeSequence objects. These Sequence objects are essentially iterators, which eliminate the need for intermediate arrays.

However, from what I understand, there is still no short-cut fusion taking place. The intermediate array structures are simply replaced with intermediate Sequence structures which are not fused.

I could be wrong, but I believe that expressions like Lazy(array).map(f).map(g) produce two separate MappedSequence objects in which the first MappedSequence object feeds its values to the second one, instead of the second one replacing the first one by doing the job of both (via function composition).
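
To make the distinction concrete, here is a minimal sketch of the two strategies. `ChainedMap` and `FusedMap` are hypothetical classes written for this illustration; they are not the actual MappedSequence implementation in Lazy.js. The chained version wraps its parent once per map call, while the fused version composes the functions and keeps reading from the original source.

```javascript
// Hypothetical classes for illustration only; NOT Lazy.js internals.

// Chained: every map() wraps the previous sequence in a new object.
class ChainedMap {
    constructor(parent, fn) {
        this.parent = parent;
        this.fn = fn;
    }
    map(fn) {
        return new ChainedMap(this, fn);
    }
    *[Symbol.iterator]() {
        for (const x of this.parent) yield this.fn(x);
    }
}

// Fused: map() composes the functions, so the result still reads
// directly from the original source.
class FusedMap {
    constructor(source, fn) {
        this.source = source;
        this.fn = fn;
    }
    map(fn) {
        const prev = this.fn;
        return new FusedMap(this.source, function (x) { return fn(prev(x)); });
    }
    *[Symbol.iterator]() {
        for (const x of this.source) yield this.fn(x);
    }
}

function square(x) { return x * x; }
function increment(x) { return x + 1; }

const source = [1, 2, 3, 4, 5];
const chained = new ChainedMap(source, square).map(increment);
const fused = new FusedMap(source, square).map(increment);

console.log(Array.from(chained));       // [2, 5, 10, 17, 26]
console.log(Array.from(fused));         // [2, 5, 10, 17, 26]
console.log(chained.parent === source); // false: another sequence sits in between
console.log(fused.source === source);   // true: a single layer over the array
```

Both variants produce the same values and neither allocates an intermediate array. The difference is that the chained variant pulls each element through two nested iterators, whereas the fused variant uses one iterator and one composed function.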

TLDR: Do Immutable.js and Lazy.js indeed perform short-cut fusion? As far as I know they get rid of intermediate arrays by emulating lazy evaluation via sequence objects (i.e. iterators). However, I believe that these iterators are chained: one iterator feeding its values lazily to the next. They are not merged into a single iterator. Hence they do not “eliminate intermediate representations”. They only transform arrays into constant space sequence objects.

I'm the author of Immutable.js (and a fan of Lazy.js).

Do Lazy.js and Immutable.js's Seq use short-cut fusion? No, not exactly. But they do remove the intermediate representation of operation results.

Short-cut fusion is a code compilation/transpilation technique. Your example is a good one:

var a = [1,2,3,4,5].map(square).map(increment);

Transpiled:

var a = [1,2,3,4,5].map(compose(square, increment));

Lazy.js and Immutable.js are not transpilers and will not rewrite code. They are runtime libraries. So instead of short-cut fusion (a compiler technique) they use iterable composition (a runtime technique).

You answer this in your TLDR:

As far as I know they get rid of intermediate arrays by emulating lazy evaluation via sequence objects (i.e. iterators). However, I believe that these iterators are chained: one iterator feeding its values lazily to the next. They are not merged into a single iterator. Hence they do not "eliminate intermediate representations". They only transform arrays into constant space sequence objects.

That is exactly right.

Let's unpack:

Arrays store intermediate results when chaining:

var a = [1,2,3,4,5];
var b = a.map(square); // b: [1,4,9,16,25] created in O(n)
var c = b.map(increment); // c: [2,5,10,17,26] created in O(n)

Short-cut fusion transpilation creates intermediate functions:

var a = [1,2,3,4,5];
var f = compose(square, increment); // f: Function created in O(1)
var c = a.map(f); // c: [2,5,10,17,26] created in O(n)

Iterable composition creates intermediate iterables:

var a = [1,2,3,4,5];
var i = lazyMap(a, square); // i: Iterable created in O(1)
var j = lazyMap(i, increment); // j: Iterable created in O(1)
var c = Array.from(j); // c: [2,5,10,17,26] created in O(n)

Note that using iterable composition, we have not created a store of intermediate results. When these libraries say they do not create intermediate representations, what they mean is exactly what is described in this example. No data structure is created holding the values [1,4,9,16,25].

However, of course some intermediate representation is made. Each "lazy" operation must return something. They return an iterable. Creating these is extremely cheap and not related to the size of the data being operated on. Note that in short-cut fusion transpilation, an intermediate representation is also made. The result of compose is a new function. Functional composition (hand-written or the result of a short-cut fusion compiler) is closely related to iterable composition.

The goal of removing intermediate representations is performance, especially regarding memory. Iterable composition is a powerful way to implement this, and it does not require the overhead of parsing and rewriting code that an optimizing compiler incurs, which would be out of place in a runtime library.


Appendix:

This is what a simple implementation of lazyMap might look like:

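function lazyMap(iterable, mapper) {
  return {
    // Symbol.iterator is the standard key that "@@iterator" refers to.
    [Symbol.iterator]: function() {
      var iterator = iterable[Symbol.iterator]();
      return {
        next: function() {
          var step = iterator.next();
          // Pass the final { done: true } step through; otherwise map the value.
          return step.done ? step : { done: false, value: mapper(step.value) };
        }
      };
    }
  };
}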
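
A usage sketch, assuming an ES2015 runtime where built-in collections expose `Symbol.iterator`. It restates lazyMap with a generator function for brevity and adds a hypothetical `lazyFilter` built the same way; `isOdd` and the `calls` log are also made up for the illustration, and none of this is taken from Immutable.js or Lazy.js. The log shows that each element flows through the whole chain before the next one is pulled, and that `i` and `j` compute nothing until they are iterated.

```javascript
// Compact restatement of lazyMap: the returned object only remembers
// its source and the mapper; values are produced on each iteration.
function lazyMap(iterable, mapper) {
    return {
        [Symbol.iterator]: function* () {
            for (const value of iterable) yield mapper(value);
        }
    };
}

// Hypothetical companion built the same way.
function lazyFilter(iterable, predicate) {
    return {
        [Symbol.iterator]: function* () {
            for (const value of iterable) {
                if (predicate(value)) yield value;
            }
        }
    };
}

// Records the order in which the callbacks run.
const calls = [];
function square(x) { calls.push("square " + x); return x * x; }
function isOdd(x) { calls.push("isOdd " + x); return x % 2 === 1; }

const i = lazyMap([1, 2, 3], square); // O(1): nothing computed yet
const j = lazyFilter(i, isOdd);       // O(1): nothing computed yet
const callsBeforeIteration = calls.length;
console.log(callsBeforeIteration);    // 0

const c = Array.from(j);
console.log(c);     // [1, 9]
console.log(calls); // square 1, isOdd 1, square 2, isOdd 4, square 3, isOdd 9
```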
