In Clojure, how can I do a performant version of `frequencies` with transducers?

(Question credit: Fernando Abrao.)

I've heard about the performance benefits of transducers in Clojure, but I'm not sure how to use them.

Say I have a qos/device-qos-range function that returns a sequence of maps, some of which contain a decimal :samplevalue, like so:

[
  { :samplevalue 1.3, ... },
  { :othervalue -27.7, ... },
  { :samplevalue 7.5, ... },
  { :samplevalue 1.9, ... },
]

I'd like to see how many :samplevalue values fall into each integer bin, like so:

(frequencies
  (reduce #(if (some? (:samplevalue %2))
             (conj %1 (.intValue (:samplevalue %2)))
             %1)  ; keep the accumulator when :samplevalue is missing
          []
          (qos/device-qos-range origem device qos alvo inicio fim)))

;; => {1 2, 7 1}

How can I turn this into a fast version with transducers that eliminates intermediate data structures (such as the one returned by reduce)? Bonus points for code that can take advantage of multiple cores to do parallel processing.

(Answer credit: Renzo Borgatti (@reborg).)

First, let's set up some sample data, which we'll use for performance tests later. This vector contains 500k maps, all with the same key. The values repeat: there are 100k distinct values, and each one appears five times.

(def data 
 (mapv hash-map 
       (repeat :samplevalue) 
       (concat (range 1e5)
               (range 1e5)
               (range 1e5)
               (range 1e5)
               (range 1e5))))

Now let's do your transformation with transducers. Note that this solution is not parallel. I shortened your .intValue to just int, which does the same thing for values that fit in an int. Also, conditionally fetching :samplevalue from each map can be shortened to just (keep :samplevalue sequence), which is equivalent to (remove nil? (map :samplevalue sequence)). We'll use Criterium to benchmark.

(require '[criterium.core :refer [quick-bench]])
(quick-bench
  (transduce
    (comp
      (keep :samplevalue)
      (map int))
    (completing #(assoc! %1 %2 (inc (get %1 %2 0))) persistent!)
    (transient {})
    data))
;; My execution time mean: 405 ms

Note that we're not calling frequencies as an external step this time. Instead, we've woven it into the operation. And just as frequencies does internally, we perform the updates on a transient hashmap for extra performance. We do this by using a transient hashmap as the seed and completing the final value by calling persistent! on it.
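As a quick sanity check, here is a minimal sketch that wraps the same transduce call in a function and runs it on the small sample from the question. The names bin-frequencies and sample are hypothetical, chosen only for this illustration; the body is the pipeline shown above.

```clojure
;; Hypothetical helper name; the body is the same transduce call as above.
(defn bin-frequencies [maps]
  (transduce
    (comp (keep :samplevalue)   ; drop maps without :samplevalue
          (map int))            ; truncate each value to its integer bin
    (completing (fn [acc k] (assoc! acc k (inc (get acc k 0))))
                persistent!)    ; turn the transient back into a persistent map
    (transient {})
    maps))

;; Hypothetical sample data, modeled on the question's example.
(def sample
  [{:samplevalue 1.3} {:othervalue -27.7} {:samplevalue 7.5} {:samplevalue 1.9}])

(bin-frequencies sample)
;; => {1 2, 7 1}

;; keep is equivalent to map followed by removing nils.
(= (keep :samplevalue sample)
   (remove nil? (map :samplevalue sample)))
;; => true

;; The result matches the multi-step version that calls frequencies.
(= (bin-frequencies sample)
   (frequencies (map int (keep :samplevalue sample))))
;; => true
```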

We can make this parallel. For maximum performance, we use a mutable Java ConcurrentHashMap instead of an immutable Clojure data structure.

(require '[clojure.core.reducers :as r])
(import '[java.util Map]
        'java.util.concurrent.atomic.AtomicInteger
        'java.util.concurrent.ConcurrentHashMap)

(quick-bench
  (let [concurrency-level (.availableProcessors (Runtime/getRuntime))
        m (ConcurrentHashMap. (quot (count data) 2) 0.75 concurrency-level)
        combinef (fn ([] m) ([_ _] m))  ; both arities return the shared map `m`
        rf (fn [^Map m k]
             (let [^AtomicInteger v (or (.get m k) (.putIfAbsent m k (AtomicInteger. 1)))]
               (when v (.incrementAndGet v))
               m))
        reducef ((comp (keep :samplevalue) (map int)) rf)]
    (r/fold combinef reducef data)
    (into {} m)))
;; My execution time mean: 70 ms

Here we use fold from the clojure.core.reducers library to achieve parallelism. Note that in a parallel context any transducers one uses need to be stateless. Also note that a ConcurrentHashMap doesn't support using nil as a key or value; fortunately, we don't need to do that here.

The output is converted into an immutable Clojure hashmap at the end. You can remove that step and just use the ConcurrentHashMap instance for an additional speedup. On my machine, removing the into step makes the whole fold take about 26 ms.
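To check that the parallel version counts the same way as the sequential one, the sketch below wraps the fold in a function and compares its result with a plain frequencies pipeline. The names parallel-bin-frequencies and binned-sample are hypothetical. One assumption goes beyond the benchmarked code: the counters stored in the map are AtomicInteger instances, so this sketch unwraps them into plain numbers before comparing.

```clojure
(require '[clojure.core.reducers :as r])
(import 'java.util.Map
        'java.util.concurrent.atomic.AtomicInteger
        'java.util.concurrent.ConcurrentHashMap)

;; Hypothetical wrapper around the fold shown above. A fresh map is
;; created on every call, so repeated calls never share counts.
(defn parallel-bin-frequencies [maps]
  (let [m (ConcurrentHashMap.)
        combinef (fn ([] m) ([_ _] m))   ; every chunk writes to the same map
        rf (fn [^Map m k]
             (let [^AtomicInteger v (or (.get m k)
                                        (.putIfAbsent m k (AtomicInteger. 1)))]
               ;; v is nil only when this call inserted the initial count of 1
               (when v (.incrementAndGet v))
               m))
        reducef ((comp (keep :samplevalue) (map int)) rf)]
    (r/fold combinef reducef maps)
    ;; Assumption for this sketch: unwrap the AtomicInteger counters.
    (into {} (map (fn [[k ^AtomicInteger v]] [k (.get v)])) m)))

;; Hypothetical test data: one map without :samplevalue, then 2000 values
;; spread evenly over the bins 0 through 9. A vector is used so that
;; r/fold can split the work into chunks.
(def binned-sample
  (vec (concat [{:othervalue -27.7}]
               (for [i (range 2000)]
                 {:samplevalue (+ (mod i 10) 0.5)}))))

(= (parallel-bin-frequencies binned-sample)
   (frequencies (map int (keep :samplevalue binned-sample))))
;; => true
```

Without the unwrapping step, the values in the resulting Clojure map are the AtomicInteger objects themselves, which is worth keeping in mind if the counts are used in further arithmetic.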

Edit 2017-11-20: User @clojuremostly correctly pointed out that an earlier version of this answer had the call to quick-bench inside the let block that initialized the concurrent hash map instance, which meant that the benchmark used the same instance for all of its runs. I moved the call to quick-bench outside the let block. This did not significantly affect the results.
