
Why not sequence- functions for all in Racket

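Are there any disadvantages to using sequence-length, sequence-ref, sequence-map, etc. in Racket rather than the separate functions for lists (length, list-ref, etc.), strings (string-length, string-ref, etc.), vectors, and so on?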

Performance.

Consider this tiny benchmark:

#lang racket/base

(require racket/sequence)

(define len 10000)
(define vec (make-vector len))

(collect-garbage)
(collect-garbage)
(collect-garbage)

(time (void (for/list ([i (in-range len)])
              (vector-ref vec i))))

(collect-garbage)
(collect-garbage)
(collect-garbage)

(time (void (for/list ([i (in-range len)])
              (sequence-ref vec i))))

This is the output on my machine:

; vectors (vector-ref vs sequence-ref)
cpu time: 1 real time: 1 gc time: 0
cpu time: 2082 real time: 2081 gc time: 0

Yes, that's a difference of three orders of magnitude.

Why? Well, racket/sequence is not a terribly “smart” API, and even though vectors are random access, sequence-ref is not. Combined with the Racket optimizer's ability to heavily optimize primitive operations such as vector-ref, this makes the sequence API a pretty poor interface from a performance standpoint.
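To make the "not random access" point concrete, here is a minimal sketch. It assumes, as described above, that a generic accessor has to reach element i by stepping through the sequence from the start. linear-ref is a hypothetical helper written for illustration; it is not the library's implementation of sequence-ref.

```racket
#lang racket/base

(require racket/sequence)

;; Hypothetical helper (not the library's code): reach index i by walking
;; the sequence from the front, counting elements as we go.
(define (linear-ref seq i)
  (for/first ([x seq]
              [k (in-naturals)]
              #:when (= k i))
    x))

(define vec (vector 'a 'b 'c 'd))

(vector-ref vec 2)    ; direct indexing into the vector
(sequence-ref vec 2)  ; same element via the generic sequence interface
(linear-ref vec 2)    ; same element via an explicit linear walk
```

All three calls return the same element, but only vector-ref can jump straight to it; a walk like linear-ref does work proportional to the index on every call.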

Of course, this is a little unfair, because vectors are random access while things like lists are not. However, performing the exact same test as the one above but using lists instead of vectors still yields a pretty grim result:

; lists (list-ref vs sequence-ref)
cpu time: 113 real time: 113 gc time: 0
cpu time: 1733 real time: 1732 gc time: 0
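The list version of the benchmark is not shown. A sketch of what it might look like, assuming it mirrors the vector test with a list of the same length (the name lst and the use of build-list are assumptions):

```racket
#lang racket/base

(require racket/sequence)

(define len 10000)
;; Assumption: a list with as many elements as the vector in the first test.
(define lst (build-list len values))

(collect-garbage)
(collect-garbage)
(collect-garbage)

;; Specialized accessor: list-ref is linear in the index.
(time (void (for/list ([i (in-range len)])
              (list-ref lst i))))

(collect-garbage)
(collect-garbage)
(collect-garbage)

;; Generic accessor: the same traversal, through the sequence API.
(time (void (for/list ([i (in-range len)])
              (sequence-ref lst i))))
```

Timings will vary by machine and Racket version, so treat the numbers above as one data point rather than a fixed ratio.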

The sequence API is slow, mostly because of a high level of indirection.

Now, performance alone is not a reason to reject an API outright, since there are concrete advantages to working at a higher level of abstraction. That said, I think the sequence API is not a good abstraction, because it:

  1. …is needlessly stateful in its implementation, which puts an unnecessary burden on implementors of the interface.

  2. …does not accommodate things that do not resemble lists, such as random-access vectors or hash tables.

If you want to work with a higher-level API, one possible option is the collections package, which attempts to provide an API similar to racket/sequence while accommodating more kinds of data structures and offering a more complete set of functions. Disclaimer: I am the author of the collections package.

Running the above benchmark once more with collections, the performance is still worse than using the underlying functions directly, but it's at least a bit more manageable:

; vectors (vector-ref vs ref)
cpu time: 2 real time: 1 gc time: 0
cpu time: 97 real time: 98 gc time: 10

; lists (list-ref vs ref)
cpu time: 104 real time: 103 gc time: 0
cpu time: 481 real time: 482 gc time: 0

Whether or not you can afford the overhead depends on what exactly you're doing, and it's up to you to make that call. The specialized operations will always be at least somewhat faster than the ones that defer to them, as long as some sort of dynamic dispatch is being performed. As always, remember the rule of performance optimization: don't guess, measure.
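As a rough illustration of that last point, the sketch below compares a specialized call with a wrapper that dispatches at run time before deferring to it. generic-ref is a hypothetical function written for this example; it is not how racket/sequence or collections implement their dispatch.

```racket
#lang racket/base

;; Hypothetical generic accessor: inspect the argument's type at run time,
;; then defer to the matching specialized operation.
(define (generic-ref coll i)
  (cond
    [(vector? coll) (vector-ref coll i)]
    [(string? coll) (string-ref coll i)]
    [(pair? coll)   (list-ref coll i)]
    [else (raise-argument-error 'generic-ref
                                "(or/c vector? string? pair?)"
                                coll)]))

(define len 1000000)
(define vec (make-vector len 0))

(collect-garbage)
;; Specialized operation, called directly.
(time (for ([i (in-range len)])
        (vector-ref vec i)))

(collect-garbage)
;; Same operation reached through the run-time type check.
(time (for ([i (in-range len)])
        (generic-ref vec i)))
```

How large the gap is, or whether the compiler erases it in a case this simple, is exactly the kind of thing to measure on your own workload rather than assume.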
