Scala: Regex that matches everything up to a certain character
I want my regex to print everything before a { or {{ (not including them).
What I have so far is:
class ExpressionParser extends RegexParsers {
val regExpr = """^.*?((?=\{{2})|(?=\{)|$)""".r //not sure about the "$". Added it because test case 1 wasn't printing. see below
def program: Parser[Any] = regExpr
}
And here are my tests:
object Test {
def main(args: Array[String]): Unit = {
val p = new ExpressionParser()
val test = p.parseAll(p.program, "tests go here") // doesn't print anything
if(test.successful) println(test.get)
// replace "tests go here" with each of these
//"This is plain text so should always print") // this isn't printing so make checks for { optional
//"abc {{"
//"abc de{ fg{{{ hi"
//"abc } {{ {{ de{' fg{{{ hi")
}
}
I want it to print:
//This is plain text so should always print
//abc
//abc de
//abc {
Only the first test prints. Why?
Thanks!
Scroll down to the edit for an answer written after the poster made the question more specific.
I've never heard of an ExpressionParser built into the Scala API, but if you want to get everything up to a certain point, or between two things, you can use:
(?s)(.*)
So to get everything before the letter 'a' you would use:
(?s)(.*)a
Code example:
val regex2 = """(?s)(.*)a""".r
val str1 = "somethinga"
str1 match {
case regex2(left) => println(left)
}
This will print "something" without quotes.
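One caveat worth checking: .* is greedy, so (.*)a captures everything up to the last 'a', not the first. The sketch below uses only scala.util.matching.Regex from the standard library and compares the greedy pattern with a lazy variant. The names GreedyVsLazy, greedy, lazyUpToA, and before are made up for illustration.

```scala
import scala.util.matching.Regex

object GreedyVsLazy {
  // Greedy: (.*) grabs as much as possible, so the capture ends at the LAST 'a'.
  val greedy: Regex = """(?s)(.*)a""".r
  // Lazy: (.*?) ends at the FIRST 'a'; the trailing .* lets the pattern cover the rest of the string.
  val lazyUpToA: Regex = """(?s)(.*?)a.*""".r

  // Hypothetical helper: the captured prefix, or None when the pattern does not match the whole string.
  def before(re: Regex, s: String): Option[String] = s match {
    case re(left) => Some(left)
    case _        => None
  }

  def main(args: Array[String]): Unit = {
    println(before(greedy, "banana"))    // Some(banan)
    println(before(lazyUpToA, "banana")) // Some(b)
  }
}
```

The default case also avoids the MatchError that a bare match would throw on input that contains no 'a' at the end.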
Edit: Since you have now updated your question to show you are using RegexParsers, here is a solution using that, though it is quite over-the-top and unnecessary if this is all you are using RegexParsers for.
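import scala.util.parsing.combinator.RegexParsers

class ExpressionParser extends RegexParsers {
  // Lazy .*? stops at the first {; the second alternative handles input with no {.
  def remover: Parser[String] = """.*?(?=\{)|.*""".r
}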
In main:
val p = new ExpressionParser()
// Use parse, not parseAll: the lookahead leaves the { unconsumed, and parseAll fails unless all input is consumed.
val test = p.parse(p.remover, "tests go here{")
if (test.successful) println(test.get) // prints "tests go here"
I was able to figure this out by reading the RegexParsers documentation here: https://github.com/scala/scala-parser-combinators and https://github.com/scala/scala-parser-combinators/blob/1.1.x/docs/Getting_Started.md
If the documentation still doesn't make this clear, here is an explanation. The pattern uses a lookahead group, (?=...), which checks that the text right after the current position matches the pattern inside the group but does not include that text in the result.
Therefore, once the regex reaches a {, it matches everything up to the { and returns that.
The reason for the | is that the regex first tries to match "everything followed by a {", and if the input contains no {, that alternative fails. Therefore we need an "or" (|) to say: if there isn't a {, just take everything.
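The lookahead-plus-alternation idea can be checked without RegexParsers. The sketch below uses only Regex.findPrefixOf from the standard library; the names UpToBrace, upToBrace, and beforeBrace are hypothetical. It assumes single-line input and uses a lazy .*? so that the match stops at the first {, which is what the question's expected output asks for.

```scala
import scala.util.matching.Regex

object UpToBrace {
  // First alternative: shortest prefix that is followed by "{" (the brace itself is not consumed).
  // Second alternative: there is no "{" on the line, so take everything.
  val upToBrace: Regex = """.*?(?=\{)|.*""".r

  // Hypothetical helper: the text before the first "{", or the whole line when there is none.
  def beforeBrace(s: String): String = upToBrace.findPrefixOf(s).getOrElse("")

  def main(args: Array[String]): Unit = {
    val inputs = List(
      "This is plain text so should always print",
      "abc {{",
      "abc de{ fg{{{ hi"
    )
    // Brackets make trailing spaces visible in the output.
    inputs.foreach(s => println("[" + beforeBrace(s) + "]"))
  }
}
```

Note that the result for "abc {{" keeps the trailing space ("abc "); call trim on the result if that matters.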
The reason we can't just add a ? at the end of the lookahead group (the left part of the |) to make the lookahead optional is that the { would then not actually be removed from the match. You can try it out with this regex:
.*(?=\{)?
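To try this without setting up a parser, here is a small sketch. It assumes the JVM regex engine (java.util.regex), which accepts a quantifier on a lookahead; the names OptionalLookahead, optional, and matched are made up for illustration. Because the lookahead is optional, the greedy .* is never forced to back off, so the { stays in the match.

```scala
import scala.util.matching.Regex

object OptionalLookahead {
  // The trailing ? makes the lookahead optional, so it no longer constrains where .* stops.
  val optional: Regex = """.*(?=\{)?""".r

  // Hypothetical helper: the prefix of s matched by the pattern, if any.
  def matched(s: String): Option[String] = optional.findPrefixOf(s)

  def main(args: Array[String]): Unit = {
    println(matched("tests go here{")) // Some(tests go here{)
  }
}
```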