
Matching a string with a regex in Scala

I wrote the following function:

import scala.util.matching.Regex
val COL1 = "COL1"
val COL2 = "COL2"
val COL3 = "COL3"
val COL4 = "COL4"
val COL5 = "COL5"
val reg = ".+-([\w\d]{3})-([\d\w]{3})-([\d\w]{3})-([\w]+)$-([\w]+)".r.unanchored
val dataExtraction: String => Map[String, String] = {
  string: String => {
    string match {
      case reg(col1, col2, col3, col4, col5) =>
        Map(COL1 -> col1, COL2 -> col2, COL3 -> col3, COL4 -> col4, COL5 -> col5)
      case _ => Map(COL1 -> "", COL2 -> "", COL3 -> "", COL4 -> "", COL5 -> "")
    }
  }
}

It is supposed to parse strings like "dep-gll-cde3-l4-result" or "cde3-gll-dep-l4-result".

Any idea how to define a regex that parses both of these?

You may use the following regex:

val reg = """(\w{3,4})-(\w{3})-(\w{3,4})-(\w+)-(\w+)""".r

You need not make it unanchored, since that pattern matches each of your inputs in full.
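Plugged into the function from the question, the suggested pattern might be used as follows. This is only a minimal sketch: the object name `DataExtractionSketch` and the `Columns` sequence are illustrative names, not part of the original code.

```scala
import scala.util.matching.Regex

object DataExtractionSketch {
  // Hypothetical helper: the five column names from the question, in order.
  val Columns: Seq[String] = Seq("COL1", "COL2", "COL3", "COL4", "COL5")

  // The pattern from the answer. It is anchored by default, so the whole
  // input has to match for the first case below to fire.
  val reg: Regex = """(\w{3,4})-(\w{3})-(\w{3,4})-(\w+)-(\w+)""".r

  val dataExtraction: String => Map[String, String] = {
    case reg(c1, c2, c3, c4, c5) => Columns.zip(Seq(c1, c2, c3, c4, c5)).toMap
    case _                       => Columns.map(_ -> "").toMap
  }
}
```

With this sketch, `DataExtractionSketch.dataExtraction("dep-gll-cde3-l4-result")` maps COL1 to "dep" and COL3 to "cde3", while "cde3-gll-dep-l4-result" maps COL1 to "cde3" and COL3 to "dep". Any input that does not match in full falls through to the map of empty strings.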

Note that inside a triple-quoted string literal you can write each backslash as a single \ ; in a regular string literal, as in your code, they need doubling ( \\ ). Also, note the {3,4} quantifiers, which seem sufficient for the cases you provided.
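The backslash rule can be checked directly. In the sketch below (the object and value names are illustrative, not from the original), the triple-quoted literal with single backslashes and the regular literal with doubled backslashes produce the same pattern string:

```scala
object BackslashSketch {
  // Triple-quoted literal: backslashes are taken literally, so one is enough.
  val tripleQuoted: String = """(\w{3,4})-(\w{3})"""

  // Regular literal: each backslash must be escaped, hence the doubling.
  val regular: String = "(\\w{3,4})-(\\w{3})"

  // Both literals hold exactly the same characters.
  val samePattern: Boolean = tripleQuoted == regular

  // Either string compiles to an equivalent regex.
  val matchesBoth: Boolean =
    "cde3-gll".matches(tripleQuoted) && "cde3-gll".matches(regular)
}
```

Triple quotes are usually the more readable choice for regexes, since the pattern appears exactly as the regex engine sees it.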

See the online Scala demo and the regex demo.
