Nested stream operations in Highland.js
I have a stream of directories from the readdirp module.
I want to:
- Search for a file (README.*) in each directory
- Read the first line of the file beginning with #
I am trying to do this using streams and highland.js.
I am stuck trying to process a stream of all files inside each directory.
    fs = require 'fs'
    h = require 'highland'
    readdirp = require 'readdirp'
    dirStream = readdirp root: root, depth: 0, entryType: 'directories'
    dirStream = h(dirStream)
      .filter (entry) -> entry.stat.isDirectory()
      .map (entry) ->
        # Search all files in the directory for README.
        fileStream = readdirp root: entry.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store'
        fileStream = h(fileStream).filter (entry) -> /README\..*/.test entry.name
        fileStream.each (file) ->
          readmeStream = fs.createReadStream file
          h(readmeStream)
            .split()
            .takeUntil (line) -> not line.startsWith '#' and line isnt ''
            .last(1)
            .toArray (comment) ->
              # TODO: How do I access `comment` asynchronously to include in the return value of the map?
        return {name: entry.name, comment: comment}
It's best to think of Highland streams as immutable: operations like filter and map return new streams that depend on the old stream, rather than modifying the old stream.
Also, Highland methods are lazy: you should only call each or toArray when you absolutely need the data right now.
The standard way of asynchronously mapping a stream is flatMap. It's like map, but the function you give it should return a stream. The stream you get from flatMap is the concatenation of all the returned streams. Because the new stream depends on all the old streams in order, it can be used to sequence asynchronous processes.
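
As a rough illustration of that concatenation behaviour, here is a minimal sketch that uses in-memory arrays in place of real I/O. The helper name expand is made up for this example; h, flatMap and toArray are the Highland calls discussed above.

```coffeescript
h = require 'highland'

# Hypothetical helper: each number is mapped to a small stream of two values,
# and flatMap joins those inner streams end to end.
expand = (numbers) ->
  h(numbers).flatMap (x) -> h([x, x * 10])

# toArray consumes the stream and hands over everything it emitted.
expand([1, 2, 3]).toArray (xs) ->
  console.log xs  # [1, 10, 2, 20, 3, 30]
```

Each inner stream is emitted in full before the next one starts, which is why the pairs stay together and in source order.
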
I'd modify your example to the following (clarified some variable names):
    fs = require 'fs'
    h = require 'highland'
    readdirp = require 'readdirp'
    readmeStream = h(readdirp root: root, depth: 0, entryType: 'directories')
      .filter (dir) -> dir.stat.isDirectory()
      .flatMap (dir) ->
        # Search all files in the directory for README.
        h(readdirp root: dir.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store')
          .filter (file) -> /README\..*/.test file.name
          .flatMap (file) ->
            # Open the file by its full path; file.name is only the base name.
            h(fs.createReadStream file.fullPath)
              .split()
              .takeUntil (line) -> not line.startsWith '#' and line isnt ''
              .last(1)
              .map (comment) -> {name: file.name, comment}
Let's take a walk through the types in this code. First, note that flatMap has type (in Haskellish notation) Stream a → (a → Stream b) → Stream b, i.e. it takes a stream containing some things of type a, and a function expecting things of type a and returning streams containing bs, and returns a stream containing bs. It's standard for collection types (such as stream and array) to implement flatMap as concatenating the returned collections.
    h(readdirp root: root, depth: 0, entryType: 'directories')
Let's say this has type Stream Directory. The filter doesn't change the type, so the flatMap will be Stream Directory → (Directory → Stream b) → Stream b. We'll see what the function returns:
    h(readdirp root: dir.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store')
Call this a Stream File, so the second flatMap is Stream File → (File → Stream b) → Stream b.
    h(fs.createReadStream file.fullPath)
This is a Stream String. split, takeUntil and last don't change that, so what does the map do? map is very similar to flatMap: its type is Stream a → (a → b) → Stream b. In this case a is String and b is an object type {name : String, comment : String}. Then map returns a stream of that object, which is what the overall flatMap function returns. Step up, and b in the second flatMap is the object, so the first flatMap's function also returns a stream of the object, so the entire stream is a Stream {name : String, comment : String}.
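
To see the same nesting of types without touching the file system, here is a small sketch in which a plain object (the hypothetical tree) stands in for the directory listing. The helper findReadmes and the sample file names are invented for illustration; the comments give the stream type at each step.

```coffeescript
h = require 'highland'

# Hypothetical stand-in for the file system: directory name -> file names.
tree =
  docs: ['README.md', 'notes.txt']
  src: ['index.coffee', 'README.txt']

# Invented helper: returns a Stream {dir : String, file : String}.
findReadmes = (tree) ->
  h(Object.keys tree)                    # Stream String (directories)
    .flatMap (dir) ->
      h(tree[dir])                       # Stream String (files)
        .filter (file) -> /^README\./.test file
        .map (file) -> {dir, file}       # Stream {dir, file}

findReadmes(tree).toArray (found) ->
  console.log found
```

The inner map produces a stream of objects for one directory, and the outer flatMap joins those per-directory streams into a single flat stream of objects.
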
Note that because of Highland's laziness, this doesn't actually start any streaming or processing. You need to use each or toArray to cause a thunk and start the pipeline. In each, the callback will be called with your object. Depending on what you want to do with the comments, it might be best to flatMap some more (if you're writing them to a file, for example).
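
The laziness can be observed directly with a counter. This is only a sketch on an in-memory array, and the names calls and doubled are made up for the example.

```coffeescript
h = require 'highland'

calls = 0

# Building the pipeline only describes the work; the map function has not run yet.
doubled = h([1, 2, 3]).map (x) ->
  calls += 1
  x * 2

console.log calls  # 0

# Consuming the stream with toArray starts the pipeline.
doubled.toArray (xs) ->
  console.log xs, calls  # [2, 4, 6] 3
```
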
Well, I didn't mean to write an essay. Hope this helps.