Understanding core.async merge, in Clojure vs ClojureScript

I'm experimenting with core.async on Clojure and ClojureScript to understand how merge works. In particular, I want to know whether merge makes any values put on the input channels available to take immediately on the merged channel.

I have the following code:

(ns async-merge-example.core
  (:require
   #?(:clj [clojure.core.async :as async] :cljs [cljs.core.async :as async])
   [async-merge-example.exec :as exec]))

(defn async-fn-timeout
  [v]
  (async/go
    (async/<! (async/timeout (rand-int 5000)))
    v))

(defn async-fn-exec
  [v]
  (exec/exec "sh" "-c" (str "sleep " (rand-int 5) "; echo " v ";")))

(defn merge-and-print-results
  [seq async-fn]
  (let [chans (async/merge (map async-fn seq))]
    (async/go
      (while (when-let [v (async/<! chans)]
               (prn v)
               v)))))

When I try async-fn-timeout with a large-ish seq:

(merge-and-print-results (range 20) async-fn-timeout)

For both Clojure and ClojureScript I get the result I expect: results start getting printed pretty much immediately, with the expected delays.

However, when I try async-fn-exec with the same seq:

(merge-and-print-results (range 20) async-fn-exec)

For ClojureScript, I get the result I expect: results start getting printed pretty much immediately, with the expected delays. However, for Clojure, even though the sh processes are executed concurrently (subject to the size of the core.async thread pool), the results appear to be initially delayed, then mostly printed all at once! I can make this difference more obvious by increasing the size of the seq, e.g. (range 40).

Since the results for async-fn-timeout are as expected on both Clojure and ClojureScript, the finger points at the differences between the Clojure and ClojureScript implementations of exec.

But I don't know why this difference would cause this issue.

Notes:

  • These observations were made in WSL on Windows 10.
  • The source code for async-merge-example.exec is below.
  • In exec, the implementation differs for Clojure and ClojureScript due to differences between Clojure/Java and ClojureScript/NodeJS.

(ns async-merge-example.exec
  (:require
   #?(:clj [clojure.core.async :as async] :cljs [cljs.core.async :as async])))

; cljs implementation based on https://gist.github.com/frankhenderson/d60471e64faec9e2158c

; clj implementation based on https://stackoverflow.com/questions/45292625/how-to-perform-non-blocking-reading-stdout-from-a-subprocess-in-clojure

#?(:cljs (def spawn (.-spawn (js/require "child_process"))))

#?(:cljs
   (defn exec-chan
     "spawns a child process for cmd with args. routes stdout, stderr, and
      the exit code to a channel. returns the channel immediately."
     [cmd args]
     (let [c (async/chan), p (spawn cmd (if args (clj->js args) (clj->js [])))]
       (.on (.-stdout p) "data"  #(async/put! c [:out  (str %)]))
       (.on (.-stderr p) "data"  #(async/put! c [:err  (str %)]))
       (.on p            "close" #(async/put! c [:exit (str %)]))
       c)))

#?(:clj
   (defn exec-chan
     "spawns a child process for cmd with args. routes stdout, stderr, and
      the exit code to a channel. returns the channel immediately."
     [cmd args]
     (let [c (async/chan)]
       (async/go
         (let [builder (ProcessBuilder. (into-array String (cons cmd (map str args))))
               process (.start builder)]
           (with-open [reader (clojure.java.io/reader (.getInputStream process))
                       err-reader (clojure.java.io/reader (.getErrorStream process))]
             (loop []
               (let [line (.readLine ^java.io.BufferedReader reader)
                     err (.readLine ^java.io.BufferedReader err-reader)]
                 (if (or line err)
                   (do (when line (async/>! c [:out line]))
                       (when err (async/>! c [:err err]))
                       (recur))
                   (do
                     (.waitFor process)
                     (async/>! c [:exit (.exitValue process)]))))))))
       c)))

(defn exec
  "executes cmd with args. returns a channel immediately which
   will eventually receive a result map of 
   {:out [stdout-lines] :err [stderr-lines] :exit [exit-code]}"
  [cmd & args]
  (let [c (exec-chan cmd args)]
    (async/go (loop [output (async/<! c) result {}]
                (if (= :exit (first output))
                  (assoc result :exit (second output))
                  (recur (async/<! c) (update result (first output) #(conj (or % []) (second output)))))))))

Your Clojure implementation uses blocking IO in a single thread. You first read from stdout and then from stderr in a loop. Both do a blocking readLine, so each call returns only once it has actually finished reading a line. So unless your process writes the same amount of output to stdout and stderr, one stream will end up blocking the other.

Once the process is finished, readLine no longer blocks and just returns nil once the buffer is empty. So the loop just finishes reading the buffered output and then finally completes, which explains the "all at once" messages.

You'll probably want to start a second thread that deals with reading from stderr.
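
A minimal sketch of that idea, assuming a JVM-only replacement for the :clj branch of exec-chan. The namespace name and the pump-lines helper are hypothetical names introduced here for illustration, and closing the channel after :exit is an addition the original code does not make. Each stream gets its own async/thread, so a blocking readLine on one stream cannot hold up the other. As a side effect, the blocking reads no longer run inside a go block, which core.async's guidance recommends avoiding.

```clojure
(ns async-merge-example.exec-threads ; hypothetical namespace for this sketch
  (:require [clojure.core.async :as async]
            [clojure.java.io :as io])
  (:import (java.io BufferedReader)))

(defn- pump-lines
  "Hypothetical helper: reads lines from stream on a dedicated thread and
   puts [tag line] on c. Returns a channel that closes at end of stream."
  [stream tag c]
  (async/thread
    (with-open [^BufferedReader r (io/reader stream)]
      (loop []
        ;; readLine blocks, but only this thread waits on it
        (when-let [line (.readLine r)]
          (async/>!! c [tag line])
          (recur))))))

(defn exec-chan
  "Spawns cmd with args. Routes stdout lines, stderr lines, and the exit
   code to a channel, which is returned immediately."
  [cmd args]
  (let [c       (async/chan)
        builder (ProcessBuilder. ^java.util.List (mapv str (cons cmd args)))
        process (.start builder)
        out     (pump-lines (.getInputStream process) :out c)
        err     (pump-lines (.getErrorStream process) :err c)]
    (async/thread
      ;; wait until both streams are drained, then report the exit code
      (async/<!! out)
      (async/<!! err)
      (async/>!! c [:exit (.waitFor process)])
      (async/close! c)) ; assumption: nothing else is put on c after :exit
    c))
```

Since :exit is still the last message put on the channel, the existing exec function should work against this version unchanged.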

Node does not do blocking IO, so everything happens asynchronously by default and one stream doesn't block the other.
