Scala regex pattern match of ip address

I can't understand why this code returns false:

      val reg = """.*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
      "ttt20.30.4.140ttt" match{
        case reg(one, two, three, four) =>
          if (host == one + "." + two + "." + three + "." + four) true else false
        case _ => false
      }

and only if I change it to:

  val reg = """.*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
  "20.30.4.140" match{
    case reg(one, two, three, four) =>
      if (host == one + "." + two + "." + three + "." + four) true else false
    case _ => false
  }

it does match.

Your variant

def main( args: Array[String] ) : Unit = {
  val regex = """.*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
  val x = "ttt20.30.4.140ttt"

  x match {
    // Print the four captured groups as an explicit tuple
    case regex(ip1, ip2, ip3, ip4) => println((ip1, ip2, ip3, ip4))
    case _ => println("No match.")
  }
}

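matches, but not as you intend: the result is (0,30,4,140) instead of (20,30,4,140). The leading .* is greedy, so it consumes as much input as it can while still letting the rest of the pattern match. Here it swallows the "2" of "20" and leaves only "0" for the first group.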

For example, ab12 could be split by .*(\d{1,3}) into

  • ab and 12
  • ab1 and 2 (this is the split that is chosen, because .* consumes as much input as it can)
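The ab12 example can be checked directly. The following is a minimal sketch; the object name `GreedyVsReluctant` and the helper `capture` are made up for illustration and are not part of the original question or answer.

```scala
import scala.util.matching.Regex

object GreedyVsReluctant {
  // Greedy prefix: .* takes as much as it can and leaves a single digit for the group.
  val greedy: Regex = """.*(\d{1,3})""".r
  // Reluctant prefix: .*? takes as little as it can, so the group gets all the digits.
  val reluctant: Regex = """.*?(\d{1,3})""".r

  // Hypothetical helper: returns the captured digits if the whole string matches.
  def capture(re: Regex, s: String): Option[String] = s match {
    case re(digits) => Some(digits)
    case _          => None
  }

  def main(args: Array[String]): Unit = {
    println(capture(greedy, "ab12"))    // Some(2)
    println(capture(reluctant, "ab12")) // Some(12)
  }
}
```

Note that a Scala `Regex` used in a `case` pattern must match the entire string, which is why the prefix quantifier decides how the digits are divided.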

Solutions

  1. Make .* reluctant (not greedy), that is, .*? so that the whole pattern becomes

     """.*?(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r

  2. Define precisely what precedes the first number, e.g. if it is only letters, use

     """[a-zA-Z]*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r

You should use a reluctant quantifier rather than a greedy one:

val reg = """.*?(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r
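As a rough sketch of how the reluctant pattern behaves on the input from the question: the question never shows the value of `host`, so it is assumed here to be "20.30.4.140". The names `IpMatch` and `containsHost` are hypothetical.

```scala
object IpMatch {
  // Reluctant prefix so the first group captures the whole first number.
  private val ip = """.*?(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}).*""".r

  // Hypothetical helper: true when the first dotted quad found in `text` equals `host`.
  def containsHost(text: String, host: String): Boolean = text match {
    case ip(one, two, three, four) => host == s"$one.$two.$three.$four"
    case _                         => false
  }

  def main(args: Array[String]): Unit = {
    val host = "20.30.4.140" // assumed value, not given in the question
    println(containsHost("ttt20.30.4.140ttt", host)) // true
    println(containsHost("20.30.4.140", host))       // true
    println(containsHost("no address here", host))   // false
  }
}
```

Returning the comparison directly also removes the need for `if (...) true else false`.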
