
Matching against a regular expression in Scala

I fairly frequently match strings against regular expressions. In Java:

java.util.regex.Pattern.compile("\\w+").matcher("this_is").matches

Ouch. Scala has many alternatives.

  1. "\\w+".r.pattern.matcher("this_is").matches
  2. "this_is".matches("\\w+")
  3. "\\w+".r unapplySeq "this_is" isDefined
  4. val R = "\\w+".r; "this_is" match { case R() => true; case _ => false }

The first is just as heavy-weight as the Java code.

The problem with the second is that you can't supply a compiled pattern ( "this_is".matches("\\w+".r) ). (This seems to be an anti-pattern, since almost every time there is a method that takes a regex string to compile, there is also an overload that takes a compiled regex.)

The problem with the third is that it abuses unapplySeq and thus is cryptic.

The fourth is great when decomposing parts of a regular expression, but is too heavy-weight when you only want a boolean result.

Am I missing an easy way to check for matches against a regular expression? Is there a reason why String#matches(regex: Regex): Boolean is not defined? In fact, where is String#matches(uncompiled: String): Boolean defined?

You can define a pattern like this:

scala> val Email = """(\w+)@([\w\.]+)""".r

findFirstIn returns Some[String] if a match is found, and None otherwise.

scala> Email.findFirstIn("test@example.com")
res1: Option[String] = Some(test@example.com)

scala> Email.findFirstIn("test")
res2: Option[String] = None

You can even extract the groups:

scala> val Email(name, domain) = "test@example.com"
name: String = test
domain: String = example.com

Finally, you can also use the conventional String.matches method (and even reuse the previously defined Email Regex):

scala> "david@example.com".matches(Email.toString)
res6: Boolean = true
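
One difference between these two checks is worth spelling out: findFirstIn looks for the pattern anywhere in the input, while String.matches (which comes from java.lang.String) only succeeds if the whole input matches. A minimal sketch, where the object and method names (AnchoringDemo, containsEmail, isEmail) are made up for illustration:

```scala
import scala.util.matching.Regex

// Hypothetical helper object, not part of any library
object AnchoringDemo {
  val Email: Regex = """(\w+)@([\w\.]+)""".r

  // findFirstIn searches for the pattern anywhere inside the input
  def containsEmail(s: String): Boolean = Email.findFirstIn(s).isDefined

  // String.matches requires the entire input to match the pattern
  def isEmail(s: String): Boolean = s.matches(Email.toString)
}
```

With these definitions, "contact: test@example.com today" should satisfy containsEmail but not isEmail, so pick whichever semantics you actually need.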

Hope this will help.

I created a little "Pimp my Library" pattern for that problem. Maybe it'll help you out.

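import scala.language.implicitConversions
import scala.util.matching.Regex

object RegexUtils {
  // Wrapper that adds a boolean whole-string match operator to Regex
  class RichRegex(self: Regex) {
    def =~(s: String): Boolean = self.pattern.matcher(s).matches
  }
  // Implicit conversion so that any Regex picks up =~
  implicit def regexToRichRegex(r: Regex): RichRegex = new RichRegex(r)
}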

Example of use

scala> import RegexUtils._
scala> """\w+""".r =~ "foo"
res12: Boolean = true
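
On Scala 2.10 or later, the same enrichment can probably be written more compactly as an implicit value class, which removes the separate conversion method. This is only a sketch of that variant; the name RegexSyntax is an assumption chosen for the example:

```scala
import scala.util.matching.Regex

// Hypothetical container object; implicit classes must live inside an object, class, or trait
object RegexSyntax {
  // Value class wrapper: adds =~ to Regex, typically without allocating a wrapper instance
  implicit class RichRegex(val self: Regex) extends AnyVal {
    // true only if the whole string matches the compiled pattern
    def =~(s: String): Boolean = self.pattern.matcher(s).matches()
  }
}
```

After import RegexSyntax._ the call site is unchanged: """\w+""".r =~ "foo".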

I usually use

val regex = "...".r
if (regex.findFirstIn(text).isDefined) ...

but I think that is pretty awkward.
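
For what it's worth, Scala 2.13 added a matches method directly on Regex, which gives a boolean whole-input check without going through Option. A small sketch, assuming Scala 2.13 or newer for isWord; the names BooleanMatch, isWord, and isWordLegacy are invented for the example:

```scala
import scala.util.matching.Regex

// Hypothetical helper object for illustration
object BooleanMatch {
  val Word: Regex = """\w+""".r

  // Whole-input match using Regex#matches (assumes Scala 2.13+)
  def isWord(text: String): Boolean = Word.matches(text)

  // Same check on older Scala versions, via the underlying java.util.regex.Pattern
  def isWordLegacy(text: String): Boolean = Word.pattern.matcher(text).matches()
}
```

Both helpers require the entire string to match, unlike findFirstIn, which accepts a match anywhere in the text.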

Currently (Aug 2014, Scala 2.11), @David's reply describes the usual approach.

However, it seems an r"..." string interpolator may be on its way to help with this. See "How to pattern match using regular expression in Scala?"
