
Scala: Regular Expression pattern match with curly braces?

I am creating a WML-like language for an assignment, and as a first step I am supposed to write regular expressions that recognize the following:

//single = "{"
//double = "{{"
//triple = "{{{"

Here is my code for the second one:

val double = "\\{\\{\\b".r

And my test is:

println(double.findAllIn("{{ s{{ { {{{ {{ {{x").toArray.mkString(" "))

But it doesn't print anything! It's supposed to print the first, second, fifth and sixth tokens. I have tried every combination of \\b and \\B, and even \\{{2,2} instead of \\{\\{, but it's still not working. Any help?

As a side question: if I wanted it to match just the first and fifth tokens, what would I need to do?

I tested your code (Scala 2.12.2 REPL), and contrary to your "it doesn't print anything" statement, it actually prints one "{{": the occurrence in the "{{x" substring.

This is because x is a word character, so \\b matches the position between the second { and x. Keep in mind that {, unlike x, isn't a word character, and neither is a space, so there is no word boundary after any "{{" that is followed by a space.

As per this tutorial:

It matches at a position that is called a "word boundary". This match is zero-length.

There are three different positions that qualify as word boundaries:

1) Before the first character in the string, if the first character is a word character

...
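To see the word-boundary behaviour concretely, here is a minimal sketch that runs the pattern from the question against the question's test string and reports where it matches. The names `input` and `starts` are illustrative only.

```scala
// The test string from the question.
val input = "{{ s{{ { {{{ {{ {{x"

// The original pattern: two braces followed by a word boundary.
val double = """\{\{\b""".r

// '{' is not a word character, so \b after "{{" can only succeed
// when the next character is a word character (here: the 'x').
val starts = double.findAllMatchIn(input).map(_.start).toList

println(starts)                        // List(16)
println(input.substring(starts.head))  // {{x
```

The only match starts at index 16, which is the "{{" of the last token.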

As for a solution, it depends on the precise definition, but lookarounds worked for me:

"(?<!\\{)\\{{2}(?!\\{)".r

It matched the "first, second, fifth and 6th token". The expression says: match "{{" that is neither preceded nor followed by "{".
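The same idea should extend to the single and triple cases from the question. The sketch below is one way to do it, assuming a hypothetical helper `exactly(n)` (not part of the answer above) that builds the pattern for a run of exactly n opening braces.

```scala
import scala.util.matching.Regex

val input = "{{ s{{ { {{{ {{ {{x"

// Hypothetical helper: a run of exactly n '{' characters,
// neither preceded nor followed by another '{'.
def exactly(n: Int): Regex =
  ("""(?<!\{)\{{""" + n + """}(?!\{)""").r

// Start index of every match of r in the test string.
def starts(r: Regex): List[Int] =
  r.findAllMatchIn(input).map(_.start).toList

println(starts(exactly(1))) // List(7)
println(starts(exactly(2))) // List(0, 4, 13, 16)
println(starts(exactly(3))) // List(9)
```

Indices 0, 4, 13 and 16 are the first, second, fifth and sixth tokens. The lone "{" and the "{{{" are picked up only by their own patterns.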

For the side question:

"(?<![^ ])\\{\\{(?![^ ])".r //match `{{` surrounded by spaces or string boundaries

Or, depending on your interpretation of "space":

"(?<!\\S)\\{\\{(?!\\S)".r

Both matched the 1st and 5th tokens. I couldn't use plain positive lookarounds because I wanted the beginning and end of the string to count as boundaries automatically. The double negation of ! and [^ ] (or \\S) has the effect of implicitly including ^ and $. Alternatively, you could spell the anchors out:

"(?<=^|\\s)\\{\\{(?=\\s|$)".r
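A small sketch comparing the three variants may help. The names `spaceOnly`, `anyWhitespace`, `explicitAnchors` and `starts` are illustrative, and the tab-separated second input is an assumption added here to show where the "interpretation of space" matters.

```scala
import scala.util.matching.Regex

val input = "{{ s{{ { {{{ {{ {{x"

val spaceOnly       = """(?<![^ ])\{\{(?![^ ])""".r   // literal space only
val anyWhitespace   = """(?<!\S)\{\{(?!\S)""".r       // any whitespace
val explicitAnchors = """(?<=^|\s)\{\{(?=\s|$)""".r   // anchors spelled out

// Start index of every match of r in text.
def starts(r: Regex, text: String): List[Int] =
  r.findAllMatchIn(text).map(_.start).toList

// On the question's input all three agree: first and fifth tokens.
println(starts(spaceOnly, input))       // List(0, 13)
println(starts(anyWhitespace, input))   // List(0, 13)
println(starts(explicitAnchors, input)) // List(0, 13)

// With tabs as separators, the variants differ.
val tabbed = "a\t{{\tb"
println(starts(spaceOnly, tabbed))      // List()
println(starts(anyWhitespace, tabbed))  // List(2)
```

The `[^ ]` version treats a tab as "not a space" and rejects the match, while the `\S` version accepts any whitespace as a separator.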

You can read about lookarounds here. Basically, they treat a symbol or expression as a boundary: they check that it is present (or absent) at that position but don't include it in the matched string itself.

Some examples of lookarounds:

  • (?<=z)aaa matches "aaa" that is preceded by z
  • (?<!z)aaa matches "aaa" that is not preceded by z
  • aaa(?=z) matches "aaa" followed by z
  • aaa(?!z) matches "aaa" not followed by z

P.S. Just to make your life easier, Scala has triple-quoted (""") string literals, in which backslashes need no escaping. So instead of:

"(?<!\\S)\\{\\{(?!\\S)".r

you can just write:

"""(?<!\S)\{\{(?!\S)""".r
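As a quick sanity check, the two spellings should compile to the same pattern text. The names `escaped` and `tripleQuoted` are illustrative.

```scala
val escaped      = "(?<!\\S)\\{\\{(?!\\S)".r
val tripleQuoted = """(?<!\S)\{\{(?!\S)""".r

// Both literals denote the same regular expression source.
println(escaped.regex == tripleQuoted.regex) // true
println(tripleQuoted.regex)                  // (?<!\S)\{\{(?!\S)
```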
