How do I apply the enrich-my-library pattern to Scala collections?

One of the most powerful patterns available in Scala is the enrich-my-library* pattern, which uses implicit conversions to appear to add methods to existing classes without requiring dynamic method resolution. For example, if we wished that all strings had the method spaces that counted how many whitespace characters they had, we could:

class SpaceCounter(s: String) {
  def spaces = s.count(_.isWhitespace)
}
implicit def string_counts_spaces(s: String) = new SpaceCounter(s)

scala> "How many spaces do I have?".spaces
res1: Int = 5
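
As an aside, the same enrichment can be written more compactly as an implicit class. This is only a minimal sketch, assuming Scala 2.10 or later; the names StringEnrichment and SpaceCounterOps are illustrative and do not appear in the original code.

```scala
object StringEnrichment {
  // An implicit class bundles the wrapper class and the implicit
  // conversion from String into a single declaration.
  implicit class SpaceCounterOps(s: String) {
    def spaces: Int = s.count(_.isWhitespace)
  }
}
import StringEnrichment._

val n = "How many spaces do I have?".spaces
```

The wrapper is placed inside an object because implicit classes in Scala 2 cannot be declared at the top level of a compiled source file.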

Unfortunately, this pattern runs into trouble when dealing with generic collections. For example, a number of questions have been asked about grouping items sequentially with collections. There is nothing built in that works in one shot, so this seems an ideal candidate for the enrich-my-library pattern using a generic collection C and a generic element type A:

class SequentiallyGroupingCollection[A, C[A] <: Seq[A]](ca: C[A]) {
  def groupIdentical: C[C[A]] = {
    if (ca.isEmpty) C.empty[C[A]]
    else {
      val first = ca.head
      val (same,rest) = ca.span(_ == first)
      same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
    }
  }
}

except, of course, it doesn't work. The REPL tells us:

<console>:12: error: not found: value C
               if (ca.isEmpty) C.empty[C[A]]
                               ^
<console>:16: error: type mismatch;
 found   : Seq[Seq[A]]
 required: C[C[A]]
                 same +: (new SequentiallyGroupingCollection(rest)).groupIdentical
                      ^

There are two problems: how do we get a C[C[A]] from an empty C[A] list (or from thin air)? And how do we get a C[C[A]] back from the same +: line instead of a Seq[Seq[A]]?

* Formerly known as pimp-my-library.

The key to understanding this problem is to realize that there are two different ways to build and work with collections in the collections library. One is the public collections interface with all its nice methods. The other, which is used extensively in creating the collections library but almost never outside of it, is the builders.

Our problem in enriching is exactly the same one that the collections library itself faces when trying to return collections of the same type. That is, we want to build collections, but when working generically, we don't have a way to refer to "the same type that the collection already is". So we need builders.

Now the question is: where do we get our builders from? The obvious place is from the collection itself. This doesn't work. We already decided, in moving to a generic collection, that we were going to forget the type of the collection. So even though the collection could return a builder that would generate more collections of the type we want, it wouldn't know what the type was.

Instead, we get our builders from CanBuildFrom implicits that are floating around. These exist specifically for the purpose of matching input and output types and giving you an appropriately typed builder.

So, we have two conceptual leaps to make:

  1. We aren't using standard collections operations, we're using builders.
  2. We get these builders from implicit CanBuildFroms, not from our collection directly.

Let's look at an example.

class GroupingCollection[A, C[A] <: Iterable[A]](ca: C[A]) {
  import collection.generic.CanBuildFrom
  def groupedWhile(p: (A,A) => Boolean)(
    implicit cbfcc: CanBuildFrom[C[A],C[A],C[C[A]]], cbfc: CanBuildFrom[C[A],A,C[A]]
  ): C[C[A]] = {
    val it = ca.iterator
    val cca = cbfcc()
    if (!it.hasNext) cca.result
    else {
      val as = cbfc()
      var olda = it.next
      as += olda
      while (it.hasNext) {
        val a = it.next
        if (p(olda,a)) as += a
        else { cca += as.result; as.clear; as += a }
        olda = a
      }
      cca += as.result
    }
    cca.result
  }
}
implicit def iterable_has_grouping[A, C[A] <: Iterable[A]](ca: C[A]) = {
  new GroupingCollection[A,C](ca)
}

Let's take this apart. First, in order to build the collection-of-collections, we know we'll need to build two types of collections: C[A] for each group, and C[C[A]] that gathers all the groups together. Thus, we need two builders, one that takes As and builds C[A]s, and one that takes C[A]s and builds C[C[A]]s. Looking at the type signature of CanBuildFrom, we see

CanBuildFrom[-From, -Elem, +To]

which means that CanBuildFrom wants to know the type of collection we're starting with (in our case, C[A]), and then the elements of the generated collection and the type of that collection. So we fill those in as implicit parameters cbfcc and cbfc.
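
To make the three type parameters concrete, here is a small sketch of asking the compiler for one of these implicits by hand. It assumes Scala 2.12 or earlier (the collections redesign in 2.13 removed CanBuildFrom); the value names are illustrative only.

```scala
import scala.collection.generic.CanBuildFrom

// From = List[Int], Elem = String, To = List[String]:
// "starting from a List[Int], given String elements, build a List[String]".
val cbf = implicitly[CanBuildFrom[List[Int], String, List[String]]]

// Applying the CanBuildFrom yields an empty, appropriately typed builder.
val builder = cbf()
builder += "a"
builder += "b"
val out: List[String] = builder.result()
```

Because From is contravariant, the general instance that the List companion supplies for any list is accepted where a CanBuildFrom starting from List[Int] is requested.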

Having realized this, that's most of the work. We can use our CanBuildFroms to give us builders (all you need to do is apply them). And one builder can build up a collection with +=, convert it to the collection it is supposed to ultimately be with result, and empty itself and be ready to start again with clear. The builders start off empty, which solves our first compile error, and since we're using builders instead of recursion, the second error also goes away.
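
The builder life cycle described above can be tried in isolation. A minimal sketch, using a builder obtained directly from the List companion rather than from a CanBuildFrom; the value names are illustrative:

```scala
// One builder, reused for two separate results.
val b = List.newBuilder[Int]
b += 1
b += 2
val first = b.result()   // snapshot of what has been added so far
b.clear()                // empty the builder so it can start again
b += 3
val second = b.result()
```

Here first is List(1, 2) and second is List(3); this reuse is what groupedWhile relies on for the inner builder.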

One last little detail--other than the algorithm that actually does the work--is in the implicit conversion. Note that we use new GroupingCollection[A,C] not [A,C[A]]. This is because the class declaration was for C with one parameter, which it fills in itself with the A passed to it. So we just hand it the type C, and let it create C[A] out of it. Minor detail, but you'll get compile-time errors if you try another way.

Here, I've made the method a little bit more generic than the "equal elements" collection--rather, the method cuts the original collection apart whenever its test of sequential elements fails.
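
Stripped of the generic type machinery, the cutting rule is easier to see. The following is only a sketch specialised to List, with builders taken from the List companion instead of CanBuildFrom; groupedWhileList is a hypothetical helper, not part of the answer's code.

```scala
// Hypothetical List-only version of the groupedWhile algorithm.
def groupedWhileList[A](xs: List[A])(p: (A, A) => Boolean): List[List[A]] = {
  val groups = List.newBuilder[List[A]]   // plays the role of cca
  val it = xs.iterator
  if (it.hasNext) {
    val current = List.newBuilder[A]      // plays the role of as
    var prev = it.next()
    current += prev
    while (it.hasNext) {
      val a = it.next()
      if (!p(prev, a)) {                  // test failed: close the current group
        groups += current.result()
        current.clear()
      }
      current += a
      prev = a
    }
    groups += current.result()            // flush the last open group
  }
  groups.result()
}

val runs = groupedWhileList(List(1, 2, 3, 4, 1, 2, 3, 1, 2, 1))(_ < _)
```

With the predicate _ < _, a new group starts at every element that is not larger than its predecessor, so runs is List(List(1, 2, 3, 4), List(1, 2, 3), List(1, 2), List(1)).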

Let's see our method in action:

scala> List(1,2,2,2,3,4,4,4,5,5,1,1,1,2).groupedWhile(_ == _)
res0: List[List[Int]] = List(List(1), List(2, 2, 2), List(3), List(4, 4, 4), 
                             List(5, 5), List(1, 1, 1), List(2))

scala> Vector(1,2,3,4,1,2,3,1,2,1).groupedWhile(_ < _)
res1: scala.collection.immutable.Vector[scala.collection.immutable.Vector[Int]] =
  Vector(Vector(1, 2, 3, 4), Vector(1, 2, 3), Vector(1, 2), Vector(1))

It works!

The only problem is that we don't in general have these methods available for arrays, since that would require two implicit conversions in a row. There are several ways to get around this, including writing a separate implicit conversion for arrays, casting to WrappedArray, and so on.


Edit: My favored approach for dealing with arrays and strings and such is to make the code even more generic and then use appropriate implicit conversions to make them more specific again in such a way that arrays work also. In this particular case:

import collection.generic.CanBuildFrom

class GroupingCollection[A, C, D[C]](ca: C)(
  implicit c2i: C => Iterable[A],
           cbf: CanBuildFrom[C,C,D[C]],
           cbfi: CanBuildFrom[C,A,C]
) {
  def groupedWhile(p: (A,A) => Boolean): D[C] = {
    val it = c2i(ca).iterator
    val cca = cbf()
    if (!it.hasNext) cca.result
    else {
      val as = cbfi()
      var olda = it.next
      as += olda
      while (it.hasNext) {
        val a = it.next
        if (p(olda,a)) as += a
        else { cca += as.result; as.clear; as += a }
        olda = a
      }
      cca += as.result
    }
    cca.result
  }
}

Here we've added an implicit that gives us an Iterable[A] from C--for most collections this will just be the identity (e.g. List[A] already is an Iterable[A]), but for arrays it will be a real implicit conversion. And, consequently, we've dropped the requirement that C[A] <: Iterable[A]--we've basically just made the requirement for <% explicit, so we can use it explicitly at will instead of having the compiler fill it in for us. Also, we have relaxed the restriction that our collection-of-collections is C[C[A]]--instead, it's any D[C], which we will fill in later to be what we want. Because we're going to fill this in later, we've pushed it up to the class level instead of the method level. Otherwise, it's basically the same.

Now the question is how to use this. For regular collections, we can:

implicit def collections_have_grouping[A, C[A]](ca: C[A])(
  implicit c2i: C[A] => Iterable[A],
           cbf: CanBuildFrom[C[A],C[A],C[C[A]]],
           cbfi: CanBuildFrom[C[A],A,C[A]]
) = {
  new GroupingCollection[A,C[A],C](ca)(c2i, cbf, cbfi)
}

where now we plug in C[A] for C and C[C[A]] for D[C]. Note that we do need the explicit generic types on the call to new GroupingCollection so it can keep straight which types correspond to what. Thanks to the implicit c2i: C[A] => Iterable[A], this automatically handles arrays.

But wait, what if we want to use strings? Now we're in trouble, because you can't have a "string of strings". This is where the extra abstraction helps: we can call D something that's suitable to hold strings. Let's pick Vector, and do the following:

val vector_string_builder = (
  new CanBuildFrom[String, String, Vector[String]] {
    def apply() = Vector.newBuilder[String]
    def apply(from: String) = this.apply()
  }
)

implicit def strings_have_grouping(s: String)(
  implicit c2i: String => Iterable[Char],
           cbfi: CanBuildFrom[String,Char,String]
) = {
  new GroupingCollection[Char,String,Vector](s)(
    c2i, vector_string_builder, cbfi
  )
}

We need a new CanBuildFrom to handle the building of a vector of strings (but this is really easy, since we just need to call Vector.newBuilder[String]), and then we need to fill in all the types so that the GroupingCollection is typed sensibly. Note that there is already a [String,Char,String] CanBuildFrom floating around, so strings can be made from collections of chars.

Let's try it out:

scala> List(true,false,true,true,true).groupedWhile(_ == _)
res1: List[List[Boolean]] = List(List(true), List(false), List(true, true, true))

scala> Array(1,2,5,3,5,6,7,4,1).groupedWhile(_ <= _) 
res2: Array[Array[Int]] = Array(Array(1, 2, 5), Array(3, 5, 6, 7), Array(4), Array(1))

scala> "Hello there!!".groupedWhile(_.isLetter == _.isLetter)
res3: Vector[String] = Vector(Hello,  , there, !!)

As of this commit it's a lot easier to "enrich" Scala collections than it was when Rex gave his excellent answer. For simple cases it might look like this,

import scala.collection.generic.{ CanBuildFrom, FromRepr, HasElem }
import language.implicitConversions

class FilterMapImpl[A, Repr](val r : Repr)(implicit hasElem : HasElem[Repr, A]) {
  def filterMap[B, That](f : A => Option[B])
    (implicit cbf : CanBuildFrom[Repr, B, That]) : That = r.flatMap(f(_).toSeq)
}

implicit def filterMap[Repr : FromRepr](r : Repr) = new FilterMapImpl(r)

which adds a "same result type" respecting filterMap operation to all GenTraversableLikes,

scala> val l = List(1, 2, 3, 4, 5)
l: List[Int] = List(1, 2, 3, 4, 5)

scala> l.filterMap(i => if(i % 2 == 0) Some(i) else None)
res0: List[Int] = List(2, 4)

scala> val a = Array(1, 2, 3, 4, 5)
a: Array[Int] = Array(1, 2, 3, 4, 5)

scala> a.filterMap(i => if(i % 2 == 0) Some(i) else None)
res1: Array[Int] = Array(2, 4)

scala> val s = "Hello World"
s: String = Hello World

scala> s.filterMap(c => if(c >= 'A' && c <= 'Z') Some(c) else None)
res2: String = HW

And for the example from the question, the solution now looks like,

class GroupIdenticalImpl[A, Repr : FromRepr](val r: Repr)
  (implicit hasElem : HasElem[Repr, A]) {
  def groupIdentical[That](implicit cbf: CanBuildFrom[Repr,Repr,That]): That = {
    val builder = cbf(r)
    def group(r: Repr) : Unit = {
      val first = r.head
      val (same, rest) = r.span(_ == first)
      builder += same
      if(!rest.isEmpty)
        group(rest)
    }
    if(!r.isEmpty) group(r)
    builder.result
  }
}

implicit def groupIdentical[Repr : FromRepr](r: Repr) = new GroupIdenticalImpl(r)

Sample REPL session,

scala> val l = List(1, 1, 2, 2, 3, 3, 1, 1)
l: List[Int] = List(1, 1, 2, 2, 3, 3, 1, 1)

scala> l.groupIdentical
res0: List[List[Int]] = List(List(1, 1),List(2, 2),List(3, 3),List(1, 1))

scala> val a = Array(1, 1, 2, 2, 3, 3, 1, 1)
a: Array[Int] = Array(1, 1, 2, 2, 3, 3, 1, 1)

scala> a.groupIdentical
res1: Array[Array[Int]] = Array(Array(1, 1),Array(2, 2),Array(3, 3),Array(1, 1))

scala> val s = "11223311"
s: String = 11223311

scala> s.groupIdentical
res2: scala.collection.immutable.IndexedSeq[String] = Vector(11, 22, 33, 11)

Again, note that the same result type principle has been observed in exactly the same way that it would have been had groupIdentical been directly defined on GenTraversableLike.

As of this commit the magic incantation is slightly changed from what it was when Miles gave his excellent answer.

The following works, but is it canonical? I hope one of the canons will correct it. (Or rather, cannons, one of the big guns.) If the view bound is an upper bound, you lose application to Array and String. It doesn't seem to matter if the bound is GenTraversableLike or TraversableLike; but IsTraversableLike gives you a GenTraversableLike.

import language.implicitConversions
import scala.collection.{ GenTraversable=>GT, GenTraversableLike=>GTL, TraversableLike=>TL }
import scala.collection.generic.{ CanBuildFrom=>CBF, IsTraversableLike=>ITL }

class GroupIdenticalImpl[A, R <% GTL[_,R]](val r: GTL[A,R]) {
  def groupIdentical[That](implicit cbf: CBF[R, R, That]): That = {
    val builder = cbf(r.repr)
    def group(r: GTL[_,R]) {
      val first = r.head
      val (same, rest) = r.span(_ == first)
      builder += same
      if (!rest.isEmpty) group(rest)
    }
    if (!r.isEmpty) group(r)
    builder.result
  }
}

implicit def groupIdentical[A, R <% GTL[_,R]](r: R)(implicit fr: ITL[R]):
  GroupIdenticalImpl[fr.A, R] =
  new GroupIdenticalImpl(fr conversion r)
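
To see in isolation what IsTraversableLike contributes, here is a small sketch, assuming Scala 2.10 through 2.12, where IsTraversableLike has a type member A and a conversion function; headOf is a hypothetical helper introduced only for illustration.

```scala
import scala.collection.generic.IsTraversableLike

// Hypothetical helper: fr.conversion turns r into a GenTraversableLike[fr.A, R],
// so collection methods such as head become available for any suitable R.
def headOf[R](r: R)(implicit fr: IsTraversableLike[R]): fr.A =
  fr.conversion(r).head

val c = headOf("abc")            // String is viewed as a collection of Char
val i = headOf(Array(1, 2, 3))   // Array[Int] is viewed as a collection of Int
val d = headOf(List(1.5, 2.5))   // ordinary collections work as well
```

The dependent result type fr.A is what lets a single method return Char for a String and Int for an Array[Int] without a separate type parameter for the element.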

There's more than one way to skin a cat with nine lives. This version says that once my source is converted to a GenTraversableLike, as long as I can build the result from GenTraversable, just do that. I'm not interested in my old Repr.

class GroupIdenticalImpl[A, R](val r: GTL[A,R]) {
  def groupIdentical[That](implicit cbf: CBF[GT[A], GT[A], That]): That = {
    val builder = cbf(r.toTraversable)
    def group(r: GT[A]) {
      val first = r.head
      val (same, rest) = r.span(_ == first)
      builder += same
      if (!rest.isEmpty) group(rest)
    }
    if (!r.isEmpty) group(r.toTraversable)
    builder.result
  }
}

implicit def groupIdentical[A, R](r: R)(implicit fr: ITL[R]):
  GroupIdenticalImpl[fr.A, R] =
  new GroupIdenticalImpl(fr conversion r)

This first attempt includes an ugly conversion of Repr to GenTraversableLike.

import language.implicitConversions
import scala.collection.{ GenTraversableLike }
import scala.collection.generic.{ CanBuildFrom, IsTraversableLike }

type GT[A, B] = GenTraversableLike[A, B]
type CBF[A, B, C] = CanBuildFrom[A, B, C]
type ITL[A] = IsTraversableLike[A]

class FilterMapImpl[A, Repr](val r: GenTraversableLike[A, Repr]) { 
  def filterMap[B, That](f: A => Option[B])(implicit cbf : CanBuildFrom[Repr, B, That]): That = 
    r.flatMap(f(_).toSeq)
} 

implicit def filterMap[A, Repr](r: Repr)(implicit fr: ITL[Repr]): FilterMapImpl[fr.A, Repr] = 
  new FilterMapImpl(fr conversion r)

class GroupIdenticalImpl[A, R](val r: GT[A,R])(implicit fr: ITL[R]) { 
  def groupIdentical[That](implicit cbf: CBF[R, R, That]): That = { 
    val builder = cbf(r.repr)
    def group(r0: R) { 
      val r = fr conversion r0
      val first = r.head
      val (same, other) = r.span(_ == first)
      builder += same
      val rest = fr conversion other
      if (!rest.isEmpty) group(rest.repr)
    } 
    if (!r.isEmpty) group(r.repr)
    builder.result
  } 
} 

implicit def groupIdentical[A, R](r: R)(implicit fr: ITL[R]):
  GroupIdenticalImpl[fr.A, R] = 
  new GroupIdenticalImpl(fr conversion r)
