
Scodec - Reading in a fixed-length String

I'm writing a parser for an existing file format that uses fixed-length, zero-padded strings.

For example, I have two case classes for binary structures in the file I need to parse. The first includes a 4-character string that can be one of two values; the second includes an 8-character string, where values shorter than 8 characters are NUL-padded.

case class WadHeader( magic : String, items : Int, dirOffset : Int)
case class LumpIndex( offset : Int, size : Int, lumpName : String)
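To make the field layout concrete, here is a minimal sketch in plain Scala (no scodec) of what reading and writing such a NUL-padded field involves. The object name `PaddedField` and the sample value "E1M1" are made up for illustration, and ASCII content is assumed.

```scala
import java.nio.charset.StandardCharsets.US_ASCII

object PaddedField {
  // Decode: keep the bytes before the first NUL, then read them as ASCII.
  def decode(field: Array[Byte]): String =
    new String(field.takeWhile(_ != 0), US_ASCII)

  // Encode: ASCII bytes, right-padded with NUL bytes up to the field width.
  def encode(s: String, width: Int): Array[Byte] = {
    val raw = s.getBytes(US_ASCII)
    require(raw.length <= width, s"'$s' does not fit in $width bytes")
    raw.padTo(width, 0.toByte)
  }
}
```

A value that fills the whole field has no NUL at all, so the decoder cannot rely on finding a terminator.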

I've tried to write a simple codec to parse the first:

  implicit val headerCodec : Codec[WadHeader] = {
    ("magic" | bytes(4)) ::
      ("items" | uint32) ::
      ("dirOffset" | uint32)
  }.as[WadHeader]

However, it can't transform this into a WadHeader, presumably because the decoded magic value does not match the type in the case class definition. I'd like to be able to ingest a fixed-size run of bytes and decode it into a String.

Unfortunately, scouring the documentation only turns up the 'greedy' string or size-prefixed string options.

OK, so I've figured out a solution that works. There's probably a simpler/cleaner way to do it, but this works pretty well.

First, I define a new fixedString codec for reading strings whose length I know in advance:

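  import scodec.{Attempt, Codec, DecodeResult, SizeBound}
  import scodec.Attempt.Successful
  import scodec.bits.BitVector
  import scodec.codecs.{ascii, fixedSizeBytes}

  // Codec for a fixed-width ASCII field whose unused bytes are NUL padding.
  def fixedString(size: Int): Codec[String] = new Codec[String] {
    // fixedSizeBytes pads short values with zero bytes when encoding.
    private val codec = fixedSizeBytes(size, ascii)
    def sizeBound: SizeBound = SizeBound.exact(size * 8L)
    def encode(b: String): Attempt[BitVector] = codec.encode(b)
    def decode(b: BitVector): Attempt[DecodeResult[String]] = {
      codec.decode(b) match {
        case Successful(DecodeResult(value, remainder)) =>
          // Keep only the characters before the first NUL.
          val decoded = value.takeWhile(_ != '\u0000')
          Attempt.successful(DecodeResult(decoded, remainder))
        case fail: Attempt.Failure => fail
      }
    }
    override def toString = s"fixedString($size)"
  }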
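As a possibly more compact variant of the same idea, the NUL trimming can be attached with `xmap` instead of a hand-written `Codec`. This is only a sketch, assuming scodec 1.x; `paddedAscii` and the sample bytes (ASCII "E1M1" followed by four NULs) are hypothetical names and values, not from the original post.

```scala
import scodec.Codec
import scodec.bits._
import scodec.codecs._

object FixedStringSketch {
  // Hypothetical helper: a fixed-width ASCII field, trimmed at the first NUL on decode.
  // Encoding relies on fixedSizeBytes to pad short values with zero bytes.
  def paddedAscii(size: Int): Codec[String] =
    fixedSizeBytes(size.toLong, ascii)
      .xmap[String](_.takeWhile(_ != '\u0000'), identity)

  // "E1M1" followed by four NUL bytes.
  val sample = hex"45314d3100000000".bits

  val decoded = paddedAscii(8).decode(sample) // Attempt[DecodeResult[String]]
  val encoded = paddedAscii(8).encode("E1M1") // Attempt[BitVector]
}
```

Unlike the hand-written codec, this version keeps whatever `toString` and `sizeBound` the wrapped codec reports.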

That works for the string. The second problem was just a silly mistake on my part (uint32 decodes to a Long, not an Int), which required me to update my case class definition accordingly:

case class WadHeader( magic : String, items : Long, dirOffset : Long)

object WadHeader {
  implicit val codec : Codec[WadHeader] = {
    ("magic" | fixedString(4)) ::
      ("items" | uint32) ::
      ("dirOffset" | uint32)
  }.as[WadHeader]
}

EDIT: 5/7 - Figured out that I can wrap fixedSizeBytes(size, ascii) instead of bytes; it does most of what I want, and I have updated the solution accordingly. Depending on the requirements, fixedSizeBytes(size, cstring) could be a really good solution too. However, for my use case it fails for strings that use the full field length, as there is no room for the terminating NUL.
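The cstring limitation can be seen in a small sketch. It assumes scodec 1.x and uses fixedSizeBytes, the combinator that the fixedString codec above wraps; the object name and the sample values ("E1M1" and the 8-character "TEXTURE1") are made up for illustration.

```scala
import scodec.bits._
import scodec.codecs._

object CStringLimit {
  // An 8-byte field read as a NUL-terminated string.
  val field = fixedSizeBytes(8, cstring)

  // "E1M1" plus four NUL bytes: a terminator is present, so decoding can succeed.
  val short = field.decode(hex"45314d3100000000".bits)

  // "TEXTURE1" fills all 8 bytes: there is no NUL inside the field to terminate it.
  val full = field.decode(hex"5445585455524531".bits)
}
```

Here `short` should hold the value "E1M1", while `full` should be a failed Attempt because cstring finds no terminator within the 8 bytes.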
