
Decoding structured JSON arrays with circe in Scala

Suppose I need to decode JSON arrays that look like the following, where there are a couple of fields at the beginning, some arbitrary number of homogeneous elements, and then some other field:

[ "Foo", "McBar", true, false, false, false, true, 137 ]

I don't know why anyone would choose to encode their data like this, but people do weird things, and suppose in this case I just have to deal with it.

I want to decode this JSON into a case class like this:

case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

We can write something like this:

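import cats.syntax.either._
import io.circe.{ Decoder, DecodingFailure, Json }
import io.circe.literal._ // json"..." interpolator used in the REPL sessions (circe-literal module)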

implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
  c.focus.flatMap(_.asArray) match {
    case Some(fnJ +: lnJ +: rest) =>
      rest.reverse match {
        case ageJ +: stuffJ =>
          for {
            fn    <- fnJ.as[String]
            ln    <- lnJ.as[String]
            age   <- ageJ.as[Int]
            stuff <- Json.fromValues(stuffJ.reverse).as[List[Boolean]]
          } yield Foo(fn, ln, age, stuff)
        case _ => Left(DecodingFailure("Foo", c.history))
      }
    case _ => Left(DecodingFailure("Foo", c.history)) // not an array, or fewer than two elements
  }
}

…which works:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false, 137 ]""")
res3: io.circe.Decoder.Result[Foo] = Right(Foo(Foo,McBar,137,List(true, false)))

But ugh, that's horrible. Also the error messages are completely useless:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false ]""")
res4: io.circe.Decoder.Result[Foo] = Left(DecodingFailure(Int, List()))

Surely there's a way to do this that doesn't involve switching back and forth between cursors and Json values, throwing away history in our error messages, and just generally being an eyesore?


Some context: questions about writing custom JSON array decoders like this in circe come up fairly often (e.g. this morning). The specific details of how to do this are likely to change in an upcoming version of circe (although the API will be similar; see this experimental project for some details), so I don't really want to spend a lot of time adding an example like this to the documentation, but it comes up enough that I think it does deserve a Stack Overflow Q&A.

Working with cursors

There is a better way! You can write this much more concisely while also maintaining useful error messages by working directly with cursors all the way through:

case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

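import cats.syntax.either._
import io.circe.Decoder
import io.circe.literal._ // json"..." interpolator used in the REPL sessions (circe-literal module)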

implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
  val fnC = c.downArray

  for {
    fn     <- fnC.as[String]
    lnC     = fnC.deleteGoRight
    ln     <- lnC.as[String]
    ageC    = lnC.deleteGoLast
    age    <- ageC.as[Int]
    stuffC  = ageC.delete
    stuff  <- stuffC.as[List[Boolean]]
  } yield Foo(fn, ln, age, stuff)
}

This also works:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false, 137 ]""")
res0: io.circe.Decoder.Result[Foo] = Right(Foo(Foo,McBar,137,List(true, false)))

But it also gives us an indication of where errors happened:

scala> fooDecoder.decodeJson(json"""[ "Foo", "McBar", true, false ]""")
res1: io.circe.Decoder.Result[Foo] = Left(DecodingFailure(Int, List(DeleteGoLast, DeleteGoRight, DownArray)))

Also it's shorter, more declarative, and doesn't require that unreadable nesting.

How it works

The key idea is that we interleave "reading" operations (the .as[X] calls on the cursor) with navigation / modification operations (downArray and the three delete method calls).

When we start, c is an HCursor that we hope points at an array. c.downArray moves the cursor to the first element in the array. If the input isn't an array at all, or is an empty array, this operation will fail, and we'll get a useful error message. If it succeeds, the first line of the for-comprehension will try to decode that first element into a string, and leaves our cursor pointing at that first element.

The second line in the for-comprehension says "okay, we're done with the first element, so let's forget about it and move to the second". The delete part of the method name doesn't mean it's actually mutating anything (nothing in circe ever mutates anything in any way that users can observe). It just means that that element won't be available to any future operations on the resulting cursor.

The third line tries to decode the second element in the original JSON array (now the first element in our new cursor) as a string. When that's done, the fourth line "deletes" that element and moves to the end of the array, and then the fifth line tries to decode that final element as an Int.

The next line is probably the most interesting:

    stuffC  = ageC.delete

This says, okay, we're at the last element in our modified view of the JSON array (where earlier we deleted the first two elements). Now we delete the last element and move the cursor up so that it points at the entire (modified) array, which we can then decode as a list of booleans, and we're done.
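
To make the navigation concrete, here is a small sketch that prints the focus of the cursor after each step. It assumes a circe version that still provides deleteGoRight and deleteGoLast (the same assumption the decoder makes) and the circe-parser module for parse; the val names mirror the decoder but are otherwise only illustrative.

```scala
import io.circe.Json
import io.circe.parser.parse

// Parse the sample array; fall back to JSON null if parsing fails (it should not here).
val doc: Json = parse("""[ "Foo", "McBar", true, false, 137 ]""").getOrElse(Json.Null)

val fnC    = doc.hcursor.downArray // focus: "Foo"
val lnC    = fnC.deleteGoRight     // "Foo" dropped from the view; focus: "McBar"
val ageC   = lnC.deleteGoLast      // "McBar" dropped from the view; focus: 137
val stuffC = ageC.delete           // 137 dropped; focus: the remaining array of booleans

// Print the focus at each step; the original `doc` value is never modified.
List(fnC, lnC, ageC, stuffC).foreach(cur => println(cur.focus.map(_.noSpaces)))
```

Each step returns a new cursor, so fnC still sees the full array even after stuffC has been derived from it.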

More error accumulation

There's actually an even more concise way you can write this:

import cats.syntax.all._
import io.circe.Decoder

implicit val fooDecoder: Decoder[Foo] = (
  Decoder[String].prepare(_.downArray),
  Decoder[String].prepare(_.downArray.deleteGoRight),
  Decoder[Int].prepare(_.downArray.deleteGoLast),
  Decoder[List[Boolean]].prepare(_.downArray.deleteGoRight.deleteGoLast.delete)
).map4(Foo)

This will also work, and it has the added benefit that if decoding would fail for more than one of the members, you can get error messages for all of the failures at the same time. For example, if we have something like this, we should expect three errors (for the non-string first name, the non-integral age, and the non-boolean stuff value):

val bad = """[["Foo"], "McBar", true, "true", false, 13.7 ]"""

val badResult = io.circe.jawn.decodeAccumulating[Foo](bad)

And that's what we see (together with the specific location information for each failure):

scala> badResult.leftMap(_.map(println))
DecodingFailure(String, List(DownArray))
DecodingFailure(Int, List(DeleteGoLast, DownArray))
DecodingFailure([A]List[A], List(MoveRight, DownArray, DeleteGoParent, DeleteGoLast, DeleteGoRight, DownArray))

Which of these two approaches you should prefer is a matter of taste and whether or not you care about error accumulation; I personally find the first a little more readable.
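
If the deleteGo* navigation methods are not available in the circe version you are using (the question already notes that these details were expected to change), the same decoding can probably be expressed with index-based navigation instead. The following is a sketch under that assumption, not one of the two approaches described here: it relies only on values and downN on the cursor and on traverse from cats, it fails fast instead of accumulating errors, and the failure message text is made up for illustration.

```scala
import cats.implicits._
import io.circe.{ Decoder, DecodingFailure }
import io.circe.parser.decode

case class Foo(firstName: String, lastName: String, age: Int, stuff: List[Boolean])

implicit val fooDecoder: Decoder[Foo] = Decoder.instance { c =>
  for {
    // `values` is None when the cursor is not focused on a JSON array.
    size  <- c.values.map(_.size).toRight(DecodingFailure("Expected a JSON array", c.history))
    fn    <- c.downN(0).as[String]
    ln    <- c.downN(1).as[String]
    // The last element is the age; everything between index 1 and the last is `stuff`.
    age   <- c.downN(size - 1).as[Int]
    stuff <- (2 until size - 1).toList.traverse(i => c.downN(i).as[Boolean])
  } yield Foo(fn, ln, age, stuff)
}

val ok  = decode[Foo]("""[ "Foo", "McBar", true, false, 137 ]""")
val bad = decode[Foo]("""[ "Foo", "McBar", true, false ]""")
```

Because every read goes through a cursor, a failure should still carry the navigation step that led to it, at the cost of walking the array by index rather than with a single cursor.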
