
How to implement “unescape” in Scala?

This is a follow-up to my previous question.

Thanks to the answers, I realized that the escape function is actually a flatMap with an argument f: Char => Seq[Char] that maps escaped characters to escaping sequences (see the answers).

Now I wonder how to implement unescape as the reverse operation of escape. I guess it should be a reverse of flatMap with an argument f: Seq[Char] => Char. Does that make sense? How would you suggest implementing unescape?

"I guess it should be a reverse of flatMap with a function f: Seq[Char] => Char. Does that make sense?"

Not really. What should your inverse function f: Seq[Char] => Char return on "abc"? It would have to apply to any sequence of characters and return a single character. You could try using PartialFunction[Seq[Char], Char] instead, but you'll run into other problems. Do you apply it to every subsequence of your input?

A more general solution would be to use foldLeft with an accumulator type that contains both the built-up part of the result and the escape sequence currently being read, something like (untested):

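def unescape(str: String): String = {
  // Accumulator: (text unescaped so far, Some(partial entity name) while inside "&...;")
  val result = str.foldLeft[(String, Option[String])](("", None)) { case ((acc, escapedAcc), c) =>
    (c, escapedAcc) match {
      // '&' outside an escape starts a new escape sequence
      case ('&', None) =>
        (acc, Some(""))
      // any other character outside an escape is copied as-is
      case (_, None) =>
        (acc + c, None)
      case ('&', Some(_)) =>
        throw new IllegalArgumentException("nested escape sequences")
      // ';' ends the escape sequence: look up the collected name
      case (';', Some(escapedAcc1)) =>
        (acc + unescapeMap(escapedAcc1), None)
      // inside an escape: keep collecting the entity name
      case (_, Some(escapedAcc1)) =>
        (acc, Some(escapedAcc1 + c))
    }
  }

  result match {
    case (escaped, None) =>
      escaped
    // the input ended while still inside "&..."
    case (_, Some(_)) =>
      throw new IllegalArgumentException("unfinished escape sequence")
  }
}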

val unescapeMap = Map("amp" -> "&", "lt" -> "<", ...)

(It's much more efficient to use StringBuilders for the accumulators, but this is simpler to understand.)
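For illustration, here is a rough sketch of what the StringBuilder variant might look like. It swaps the fold for a plain loop over mutable builders, which is one possible way to do it, not the only one. The object name and the entries in the entity table beyond "amp" and "lt" are assumptions made for the example.

```scala
object UnescapeWithBuilders {
  // Assumed entity table for illustration; only "amp" and "lt" appear in the answer above.
  val entities: Map[String, String] =
    Map("amp" -> "&", "lt" -> "<", "gt" -> ">", "quot" -> "\"", "apos" -> "'")

  def unescape(str: String): String = {
    val out = new StringBuilder   // unescaped text built so far
    val name = new StringBuilder  // entity name collected while inside "&...;"
    var inEscape = false

    for (c <- str) {
      if (!inEscape) {
        if (c == '&') inEscape = true else out.append(c)
      } else c match {
        case '&' =>
          throw new IllegalArgumentException("nested escape sequences")
        case ';' =>
          val key = name.toString
          out.append(entities.getOrElse(key,
            throw new IllegalArgumentException("unknown escape: " + key)))
          name.clear()
          inEscape = false
        case _ =>
          name.append(c)
      }
    }

    if (inEscape) throw new IllegalArgumentException("unfinished escape sequence")
    out.toString
  }
}
```

Unlike the fold, this appends in place instead of copying the accumulated string on every character.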

But for this specific case you could just split the string on &, then split each part except the first on ;, and get the parts you want this way.
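A minimal sketch of the split-based idea, assuming every & in the input starts an escape of the form &name; and that unknown names should be rejected. The object name and the entity table entries beyond "amp" and "lt" are hypothetical.

```scala
object UnescapeBySplit {
  // Assumed entity table for illustration; only "amp" and "lt" appear in the answer above.
  val entities: Map[String, String] =
    Map("amp" -> "&", "lt" -> "<", "gt" -> ">", "quot" -> "\"", "apos" -> "'")

  def unescape(str: String): String = {
    // Limit -1 keeps trailing empty strings, so a dangling '&' at the end is still seen.
    val parts = str.split("&", -1)

    // Every part after the first started right after an '&',
    // so it must look like "name;rest of the text".
    val rest = parts.tail.map { part =>
      part.split(";", 2) match {
        case Array(name, text) =>
          entities.getOrElse(name,
            throw new IllegalArgumentException("unknown escape: " + name)) + text
        case _ =>
          throw new IllegalArgumentException("unfinished escape sequence")
      }
    }

    parts.head + rest.mkString
  }
}
```

For example, UnescapeBySplit.unescape("a &lt; b &amp; c") splits into "a ", "lt; b " and "amp; c", and then resolves the names before each semicolon.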

This seems to be a follow-up to my own answer to the question that this question follows up on... use scala.xml.Utility.unescape:

val sb = new StringBuilder
scala.xml.Utility.unescape("amp", sb)
println(sb.toString) // prints &

or, if you just want to unescape once and throw away the StringBuilder instance:

scala.xml.Utility.unescape("amp", new StringBuilder).toString // returns "&"

This just parses individual escapes; you'll have to build a parser for entire XML strings around it yourself (the accepted answer seems to provide that bit, but reinvents the scala.xml.Utility wheel), or use something else from scala.xml instead.
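As a rough sketch of wrapping it for whole strings, one option is a regex that finds named references and hands each name to scala.xml.Utility.unescape. This assumes the scala-xml module is on the classpath; the object name, method name and regex are made up for the example, and names that Utility.unescape does not recognise (it returns null for those) are left untouched.

```scala
import scala.util.matching.Regex
import scala.xml.Utility

// Hypothetical helper; requires the scala-xml module on the classpath.
object XmlUnescape {
  // Matches named references such as &amp; or &lt; (numeric references are not handled).
  private val EntityRef: Regex = "&([A-Za-z]+);".r

  def unescapeAll(str: String): String =
    EntityRef.replaceAllIn(str, m => {
      // Utility.unescape returns null when the name is not a predefined entity.
      val replacement = Option(Utility.unescape(m.group(1), new StringBuilder))
        .map(_.toString)
        .getOrElse(m.matched)
      // Quote the result so '$' and '\' are not treated as replacement syntax.
      Regex.quoteReplacement(replacement)
    })
}
```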
