
Extracting two substrings using a regex match

I have a simple regex pattern and a string. The regex pattern is (\\d*)\\*(C.*) and the string is 32*C1234*3.

I want to extract the values 32 and C1234*3 from that string using that regex pattern.

I can achieve that in Perl using the following code:

my $x = "32*C1234*3"; 
if ( my ($x1, $x2) = $x =~ /(\d*)\*(C.*)/ ) {
    print "x1 = $x1, x2 = $x2\n";
} else {
    print "No match\n";
}

The following is my attempt using Scala 2.11:

val s = "32*C1234*3"
val re = """(\d*)\*(C.*)""".r
(re findFirstIn s).toList

When I run this, I get a single value (it appears to be the original string followed by an empty character) instead of the two values 32 and C1234*3. How do I fix the Scala code?

s match {
  case re(a,b) => s"x1 = $a , x2 = $b"
  case _ => "error"
}

You have 2 capture groups in the regex pattern, so the match pattern has to offer 2 variables to hold the matched values.
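As a minimal sketch of the same idea, the match can be wrapped in a function that returns both groups as an Option. The object name ExtractGroups and the helper extract are hypothetical names chosen for illustration; the regex is the one from the question. Note that a regex used as an extractor in a case pattern has to match the whole string, so an input with extra leading text falls through to the default case.

```scala
object ExtractGroups {
  // Same pattern as in the question: digits, a literal '*', then "C" and the rest.
  private val re = """(\d*)\*(C.*)""".r

  // Hypothetical helper: returns both capture groups, or None when the
  // whole string does not match the pattern.
  def extract(s: String): Option[(String, String)] = s match {
    case re(a, b) => Some((a, b))
    case _        => None
  }

  def main(args: Array[String]): Unit = {
    println(extract("32*C1234*3"))      // both groups are captured
    println(extract("blah|32*C1234*3")) // extractor is anchored, so no match
  }
}
```

If a match anywhere inside the string is wanted with this syntax, re.unanchored can be used as the extractor instead.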


To get all matched capture groups as a List[String], you could do this:

re.unapplySeq(s).getOrElse(List.empty)
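A small sketch of how that one-liner behaves on matching and non-matching input may help. The names AllGroups and groups are hypothetical; only unapplySeq and getOrElse come from the answer itself.

```scala
object AllGroups {
  private val re = """(\d*)\*(C.*)""".r

  // Hypothetical helper wrapping the one-liner: all capture groups as a list,
  // or an empty list when the whole string does not match.
  def groups(s: String): List[String] =
    re.unapplySeq(s).getOrElse(List.empty)

  def main(args: Array[String]): Unit = {
    println(groups("32*C1234*3")) // List(32, C1234*3)
    println(groups("no match"))   // List()
  }
}
```

An empty list here cannot be told apart from a pattern with no capture groups, so keeping the Option may be preferable when that distinction matters.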

Use findFirstMatchIn, and then use the Match object to select the groups that you need:

val m = re.findFirstMatchIn(s).get
val x1 = m.group(1)
val x2 = m.group(2)

This sets the variables m, x1, and x2 to:

m: scala.util.matching.Regex.Match = 32*C1234*3
x1: String = 32
x2: String = C1234*3

The if ... part in Perl is replaced by Option[Match] in Scala. If you are not sure whether the string actually matches, you first have to check whether the result of findFirstMatchIn isEmpty. Alternatively, you can pattern match on the returned Option. In the above example, I simply used get, because it was obvious that there is a match.

findFirstMatchIn is unanchored, so the string "blah|32*C1234*3" would return the same match with the same groups.
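For the case where a match is not guaranteed, one possible sketch avoids get by mapping over the Option[Match] and then pattern matching on the result, which mirrors the if/else in the Perl version. The names FirstMatch and firstGroups are hypothetical and not part of the Scala library.

```scala
import scala.util.matching.Regex

object FirstMatch {
  private val re: Regex = """(\d*)\*(C.*)""".r

  // Hypothetical helper: maps over the Option[Match] instead of calling .get,
  // so a non-matching string yields None rather than an exception.
  def firstGroups(s: String): Option[(String, String)] =
    re.findFirstMatchIn(s).map(m => (m.group(1), m.group(2)))

  def main(args: Array[String]): Unit = {
    firstGroups("32*C1234*3") match {
      case Some((x1, x2)) => println(s"x1 = $x1, x2 = $x2")
      case None           => println("No match")
    }
    // Unanchored search: the leading "blah|" is skipped.
    println(firstGroups("blah|32*C1234*3"))
    // No '*' followed by 'C' anywhere, so there is no match.
    println(firstGroups("nothing here"))
  }
}
```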
