Nested stream operations in Highland.js

I have a stream of directories from the readdirp module.

I want to:

  • search for a file using a regex (e.g. README.*) in each directory
  • read the first line of that file that does not start with a #
  • print out each directory and this first non-heading line of the README in the directory.

I am trying to do this using streams and highland.js.

I am stuck trying to process a stream of all files inside each directory.

fs = require 'fs'
h = require 'highland'
readdirp = require 'readdirp'

dirStream = readdirp root: root, depth: 0, entryType: 'directories'

dirStream = h(dirStream)
  .filter (entry) -> entry.stat.isDirectory()
  .map (entry) ->

    # Search all files in the directory for README.
    fileStream = readdirp root: entry.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store'
    fileStream = h(fileStream).filter (entry) -> /README\..*/.test entry.name
    fileStream.each (file) ->
      readmeStream = fs.createReadStream file.fullPath
      h(readmeStream)
        .split()
        .takeUntil (line) -> line isnt '' and not line.startsWith('#')
        .last(1)
        .toArray (comment) ->
          # TODO: How do I access `comment` asynchronously to include in the return value of the map?

    return {name: entry.name, comment: comment}

It's best to think of Highland streams as immutable: operations like filter and map return new streams that depend on the old stream rather than modifying it.

Also, Highland methods are lazy: you should only call each or toArray when you absolutely need the data right now.

The standard way of asynchronously mapping a stream is flatMap. It's like map, but the function you give it should return a stream. The stream you get from flatMap is the concatenation of all the returned streams. Because the new stream depends on all the old streams in order, it can be used to sequence asynchronous processes.
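To make the ordering concrete, here is a minimal sketch in plain JavaScript that uses async generators as a stand-in for Highland streams. It is only an analogy, not Highland's implementation: flatMap, collect, readmesIn and the directory listing are all hypothetical names made up for the illustration.

```javascript
// Plain async generators stand in for Highland streams here; this is an
// analogy, not Highland's implementation.

// Hypothetical async lookup: yields each README-like path after a short delay.
async function* readmesIn(dir) {
  const listing = { a: ['README.md'], b: ['README.txt', 'README.rst'] };
  for (const name of listing[dir] || []) {
    await new Promise((resolve) => setTimeout(resolve, 5));
    yield `${dir}/${name}`;
  }
}

// flatMap: call fn on each item, then emit everything from the returned
// stream before moving on to the next item.
async function* flatMap(source, fn) {
  for await (const item of source) {
    yield* fn(item);
  }
}

// Drain an async iterable into an array.
async function collect(source) {
  const out = [];
  for await (const item of source) out.push(item);
  return out;
}

collect(flatMap(['a', 'b'], readmesIn)).then((paths) => console.log(paths));
```

Even though every lookup is asynchronous, the results for a are emitted before anything from b, because each returned stream is drained in turn.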

I'd modify your example to the following (with some variable names clarified):

fs = require 'fs'
h = require 'highland'
readdirp = require 'readdirp'

readmeStream = h(readdirp root: root, depth: 0, entryType: 'directories')
  .filter (dir) -> dir.stat.isDirectory()
  .flatMap (dir) ->
    # Search all files in the directory for README.
    h(readdirp root: dir.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store')
    .filter (file) -> /README\..*/.test file.name
    .flatMap (file) ->
      h(fs.createReadStream file.fullPath)
        .split()
        .takeUntil (line) -> line isnt '' and not line.startsWith('#')
        .last(1)
        .map (comment) -> {name: file.name, comment}

Let's walk through the types in this code. First, note that flatMap has type (in Haskell-ish notation) Stream a → (a → Stream b) → Stream b, i.e. it takes a stream containing things of type a and a function that expects a value of type a and returns a stream of bs, and it returns a stream of bs. It's standard for collection types (such as streams and arrays) to implement flatMap by concatenating the returned collections.
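The same shape can be seen on plain arrays. This is a small sketch with made-up directory and file names, assuming a Node version that has Array.prototype.flatMap (11 or later); filesIn is a hypothetical helper, not part of readdirp or Highland.

```javascript
// Arrays stand in for streams; the names below are made up for illustration.
const dirs = ['docs', 'src'];
const filesIn = (dir) => [`${dir}/README.md`, `${dir}/index.js`];

// map gives one array per directory (a collection of collections)...
const nested = dirs.map(filesIn);

// ...while flatMap gives the same items concatenated into one collection.
const flat = dirs.flatMap(filesIn);

// flatMap is equivalent to mapping and then concatenating the results.
const viaConcat = [].concat(...nested);

console.log(nested.length, flat.length, viaConcat.length);
```

Here nested has type [[String]], while flat and viaConcat both have type [String], which matches the Stream a → (a → Stream b) → Stream b signature with arrays in place of streams.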

h(readdirp root: root, depth: 0, entryType: 'directories')

Let's say this has type Stream Directory. The filter doesn't change the type, so the flatMap will be Stream Directory → (Directory → Stream b) → Stream b. Let's see what the function returns:

h(readdirp root: dir.fullPath, depth: 0, entryType: 'files', fileFilter: '!.DS_Store')

Call this a Stream File, so the second flatMap is Stream File → (File → Stream b) → Stream b.

h(fs.createReadStream file.fullPath)

This is a Stream String. split, takeUntil and last don't change that, so what does the map do? map is very similar to flatMap: its type is Stream a → (a → b) → Stream b. In this case a is String and b is the object type {name : String, comment : String}. map therefore returns a stream of those objects, which is what the inner flatMap function returns. Stepping up a level, b in the second flatMap is the object type, so the first flatMap's function also returns a stream of those objects, and the entire stream is a Stream {name : String, comment : String}.
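If it helps to see the whole nesting with real values, here is a rough sketch that mirrors the two flatMaps using arrays and an in-memory object instead of readdirp and the file system. The tree contents and the firstComment helper are invented for the example.

```javascript
// In-memory stand-in for the directory tree; names and contents are made up.
const tree = {
  alpha: { 'README.md': '# Alpha\n\nFirst project.\nMore text.', 'index.js': '' },
  beta: { 'README.txt': '# Beta\nSecond project.' },
  gamma: { 'notes.txt': 'no readme here' },
};

// String -> [String]: the first non-empty, non-heading line, if there is one.
const firstComment = (text) =>
  text
    .split('\n')
    .filter((line) => line !== '' && !line.startsWith('#'))
    .slice(0, 1);

// [Directory] -> [{name, comment}], mirroring the two nested flatMaps.
const result = Object.keys(tree).flatMap((dir) =>
  Object.keys(tree[dir])
    .filter((file) => /README\..*/.test(file))
    .flatMap((file) =>
      firstComment(tree[dir][file]).map((comment) => ({ name: file, comment }))
    )
);

console.log(result);
```

A directory without a README (gamma here) simply contributes an empty collection, so it disappears from the flattened result.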

Note that because of Highland's laziness, building this pipeline doesn't actually start any streaming or processing. You need to use each or toArray to force the thunk and start the pipeline. In each, the callback will be called with your object. Depending on what you want to do with the comments, it might be best to flatMap some more (if you're writing them to a file, for example).
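As a loose illustration of that laziness, the sketch below uses a plain JavaScript generator in place of a Highland stream. lazyMap and log are hypothetical names, and Array.from plays the role that each or toArray would play in Highland.

```javascript
// Generators stand in for lazy streams; `log` records when work happens.
const log = [];

// A lazy map: fn only runs when a consumer asks for the next value.
function* lazyMap(source, fn) {
  for (const item of source) {
    log.push(`mapping ${item}`);
    yield fn(item);
  }
}

// Building the pipeline does no work yet.
const pipeline = lazyMap([1, 2, 3], (x) => x * 2);
const workBeforeConsuming = log.length;

// Consuming it (the analogue of each or toArray) pulls values through.
const doubled = Array.from(pipeline);

console.log(workBeforeConsuming, doubled, log.length);
```

Nothing is logged until the pipeline is consumed, which is why a pipeline that is never consumed never reads a single file.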

Well, I didn't mean to write an essay. Hope this helps.
