
Scala regex pattern matching with String Interpolation

Since Scala 2.10 we can define a new method r on StringContext like this:

import scala.util.matching.Regex

implicit class RegexContext(sc: StringContext) {
  def r = new Regex(sc.parts.mkString, sc.parts.tail.map(_ => "x"): _*)
}

Then we can easily define a regex pattern after the case keyword like this:

"123" match { 
   case r"\d+" => true 
   case _ => false 
}

What is not clear to me is how the implementation inside the implicit class RegexContext works.

Can someone explain the implementation of the method r to me, especially sc.parts.tail.map(_ => "x"): _* ?

The implementation is taken from "How to pattern match using regular expression in Scala?"

Those args are group names, which are not very useful here.

scala 2.13.0-M5> implicit class R(sc: StringContext) { def r = sc.parts.mkString.r }
defined class R

scala 2.13.0-M5> "hello" match { case r"hell.*" => }

Compare:

scala 2.13.0-M5> implicit class R(sc: StringContext) { def r = sc.parts.mkString("(.*)").r }
defined class R

scala 2.13.0-M5> "hello" match { case r"hell$x" => x }
res5: String = o
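
To make the second version less magical, here is a minimal sketch of what happens to case r"hell$x". The names InterpolatorDemo, regexSource, desugared and rest are made up for illustration. The compiler splits the literal at each hole into sc.parts, calls r on the resulting StringContext, and uses the returned Regex as an extractor.

```scala
import scala.util.matching.Regex

object InterpolatorDemo {
  // Same interpolator as in the REPL session above: every hole becomes "(.*)".
  implicit class R(sc: StringContext) {
    def r: Regex = sc.parts.mkString("(.*)").r
  }

  // r"hell$x" has one hole, so its parts are Seq("hell", "").
  val parts: Seq[String] = StringContext("hell", "").parts

  // Joining the parts with "(.*)" gives the regex source that is actually compiled.
  val regexSource: String = parts.mkString("(.*)")

  // Roughly what the pattern match expands to: call the extractor by hand.
  val desugared: Option[List[String]] =
    StringContext("hell", "").r.unapplySeq("hello")

  // The same thing written as an ordinary pattern match.
  def rest(s: String): Option[String] = s match {
    case r"hell$x" => Some(x)
    case _         => None
  }
}
```

The variable x is bound positionally by the extractor, which is why no group names are needed for this to work.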

The Regex constructor takes two arguments.

new Regex(regex: String, groupNames: String*)

The groupNames parameter is a vararg, so the names are optional. In this case it should have been left empty, because that groupNames code is pretty useless.
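
As a small sketch of why the names can be omitted: when a Regex is used in a pattern match, its groups are bound by position, not by name. The object NoGroupNames, the value range and the helper parse below are hypothetical and exist only for this illustration.

```scala
import scala.util.matching.Regex

object NoGroupNames {
  // Two capturing groups, no group names passed to the constructor.
  val range: Regex = new Regex("""(\d+)-(\d+)""")

  // The extractor binds lo and hi to groups 1 and 2 by position.
  def parse(s: String): Option[(Int, Int)] = s match {
    case range(lo, hi) => Some((lo.toInt, hi.toInt))
    case _             => None
  }
}
```

Calling NoGroupNames.parse("10-20") yields the two numbers, and an input that does not match the whole pattern falls through to None.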

Let's review what groupNames is supposed to do. We'll start without groupNames.

val rx = new Regex("~(A(.)C)~")  // pattern with 2 groups, no group names
rx.findAllIn("~ABC~").group(0) //res0: String = ~ABC~
rx.findAllIn("~ABC~").group(1) //res1: String = ABC
rx.findAllIn("~ABC~").group(2) //res2: String = B
rx.findAllIn("~ABC~").group(3) //java.lang.IndexOutOfBoundsException: No group 3

And now with groupNames.

val rx = new Regex("~(A(.)C)~", "x", "y", "z")  // 3 groups named
rx.findAllIn("~ABC~").group("x") //res0: String = ABC
rx.findAllIn("~ABC~").group("y") //res1: String = B
rx.findAllIn("~ABC~").group("z") //java.lang.IndexOutOfBoundsException: No group 3

So why is sc.parts.tail.map(_ => "x"): _* so useless? First, because the number of names created is unrelated to the number of groups in the pattern. Second, because it uses the same string, "x", for every name it specifies. That name will only be good for the last group named.

val rx = new Regex("~(A(.)C)~", "x", "x")  // 2 groups named
rx.findAllIn("~ABC~").group("x") //res0: String = B (i.e. group(2))

...and...

val rx = new Regex("~(A(.)C)~", "x", "x", "x")  // 3 groups named
rx.findAllIn("~ABC~").group("x") //java.lang.IndexOutOfBoundsException: No group 3
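
For completeness, here is a sketch of what the expression from the question evaluates to for a pattern with two holes, such as r"$a-$b". The object GroupNamesDemo and its members are illustrative names only. Since parts always has one more element than there are holes, tail.map(_ => "x") yields one "x" per hole, and : _* then passes that sequence as the groupNames varargs.

```scala
object GroupNamesDemo {
  // Parts of r"$a-$b": the literal text before, between and after the two holes.
  val parts: Seq[String] = StringContext("", "-", "").parts

  // tail drops the first part, so there is exactly one "x" per hole.
  val names: Seq[String] = parts.tail.map(_ => "x")

  // The regex source is just the parts glued together; the holes add nothing to it.
  val source: String = parts.mkString
}
```

Here source contains no capturing groups at all, while names still holds two entries, which is the mismatch between names and groups described above.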
