Parsing a large multi-nested JSON into a Scala case class

Below is a JSON example of a tweet from Twitter. It's a large JSON document. What is the best library or method to parse it into a case class in Scala?

For instance, in Play Framework 2.x it's possible to do that with its internal library by defining case classes and implicit conversions, but in this case I don't use Play. Should I?

spray-json seems to be the most popular Scala JSON library, but in this case it looks quite disappointing: the standard approach seems to be limited to 22 elements and uses pattern matching, which becomes ridiculous in the context of a multi-nested structure with hundreds of elements. Any ideas?

{
  "created_at": "Sat Oct 24 06:44:34 +0000 2015",
  "id": 657809891558576132,
  "id_str": "657809891558576132",
  "text": "RT @M23projects: Kara Walker \"Go to Hell or Atlanta, Whichever Come First\" @victoriamiro   #London https://t.co/HapqKa4i0l https://t.co/95G…",
  "source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
  "truncated": false,
  "in_reply_to_status_id": null,
  "in_reply_to_status_id_str": null,
  "in_reply_to_user_id": null,
  "in_reply_to_user_id_str": null,
  "in_reply_to_screen_name": null,
  "user": {
    "id": 2792146884,
    "id_str": "2792146884",
    "name": "Tonbridge School Art",
    "screen_name": "ArtTonSchool",
    "location": "Tonbridge",
    "url": null,
    "description": "Tonbridge School is an independent day and boarding school for boys. Tweets by the Art Department.",
    "protected": false,
    "verified": false,
    "followers_count": 187,
    "friends_count": 288,
    "listed_count": 10,
    "favourites_count": 1069,
    "statuses_count": 1764,
    "created_at": "Fri Sep 05 15:37:43 +0000 2014",
    "utc_offset": 3600,
    "time_zone": "London",
    "geo_enabled": true,
    "lang": "en-gb",
    "contributors_enabled": false,
    "is_translator": false,
    "profile_background_color": "C0DEED",
    "profile_background_image_url": "http://abs.twimg.com/images/themes/theme1/bg.png",
    "profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme1/bg.png",
    "profile_background_tile": false,
    "profile_link_color": "0084B4",
    "profile_sidebar_border_color": "C0DEED",
    "profile_sidebar_fill_color": "DDEEF6",
    "profile_text_color": "333333",
    "profile_use_background_image": true,
    "profile_image_url": "http://pbs.twimg.com/profile_images/507921409738543104/V35eZACR_normal.jpeg",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/507921409738543104/V35eZACR_normal.jpeg",
    "profile_banner_url": "https://pbs.twimg.com/profile_banners/2792146884/1410119421",
    "default_profile": true,
    "default_profile_image": false,
    "following": null,
    "follow_request_sent": null,
    "notifications": null
  },
  "geo": null,
  "coordinates": null,
  "place": null,
  "contributors": null,
  "retweeted_status": {
    "created_at": "Sat Oct 24 02:27:06 +0000 2015",
    "id": 657745100739506176,
    "id_str": "657745100739506176",
    "text": "Kara Walker \"Go to Hell or Atlanta, Whichever Come First\" @victoriamiro   #London https://t.co/HapqKa4i0l https://t.co/95GaLC4XTo",
    "source": "<a href=\"http://twitter.com/download/iphone\" rel=\"nofollow\">Twitter for iPhone</a>",
    "truncated": false,
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
      "id": 999716342,
      "id_str": "999716342",
      "name": "M23",
      "screen_name": "M23projects",
      "location": "New York",
      "url": "http://M23.co",
      "description": "M23's project space + itinerant program promotes new work by new artists. \nhttp://Instagram.com/m23projects",
      "protected": false,
      "verified": false,
      "followers_count": 9150,
      "friends_count": 7353,
      "listed_count": 174,
      "favourites_count": 1354,
      "statuses_count": 4666,
      "created_at": "Sun Dec 09 17:13:35 +0000 2012",
      "utc_offset": -14400,
      "time_zone": "Eastern Time (US & Canada)",
      "geo_enabled": true,
      "lang": "en",
      "contributors_enabled": false,
      "is_translator": false,
      "profile_background_color": "547587",
      "profile_background_image_url": "http://pbs.twimg.com/profile_background_images/884257252/e329bbc1b91d695862d5b23a209f2d34.jpeg",
      "profile_background_image_url_https": "https://pbs.twimg.com/profile_background_images/884257252/e329bbc1b91d695862d5b23a209f2d34.jpeg",
      "profile_background_tile": true,
      "profile_link_color": "414A4D",
      "profile_sidebar_border_color": "FFFFFF",
      "profile_sidebar_fill_color": "DDEEF6",
      "profile_text_color": "333333",
      "profile_use_background_image": true,
      "profile_image_url": "http://pbs.twimg.com/profile_images/458985956830236673/Z_4Bq9PJ_normal.jpeg",
      "profile_image_url_https": "https://pbs.twimg.com/profile_images/458985956830236673/Z_4Bq9PJ_normal.jpeg",
      "profile_banner_url": "https://pbs.twimg.com/profile_banners/999716342/1398650659",
      "default_profile": false,
      "default_profile_image": false,
      "following": null,
      "follow_request_sent": null,
      "notifications": null
    },
    "geo": null,
    "coordinates": null,
    "place": null,
    "contributors": null,
    "is_quote_status": false,
    "retweet_count": 2,
    "favorite_count": 3,
    "entities": {
      "hashtags": [
        {
          "text": "London",
          "indices": [
            74,
            81
          ]
        }
      ],
      "urls": [
        {
          "url": "https://t.co/HapqKa4i0l",
          "expanded_url": "http://instagram.com/m23projects",
          "display_url": "instagram.com/m23projects",
          "indices": [
            82,
            105
          ]
        }
      ],
      "user_mentions": [
        {
          "screen_name": "victoriamiro",
          "name": "Victoria Miro",
          "id": 373924746,
          "id_str": "373924746",
          "indices": [
            58,
            71
          ]
        }
      ],
      "symbols": [],
      "media": [
        {
          "id": 657745078413201408,
          "id_str": "657745078413201408",
          "indices": [
            106,
            129
          ],
          "media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
          "media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
          "url": "https://t.co/95GaLC4XTo",
          "display_url": "pic.twitter.com/95GaLC4XTo",
          "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
          "type": "photo",
          "sizes": {
            "small": {
              "w": 340,
              "h": 255,
              "resize": "fit"
            },
            "medium": {
              "w": 600,
              "h": 450,
              "resize": "fit"
            },
            "thumb": {
              "w": 150,
              "h": 150,
              "resize": "crop"
            },
            "large": {
              "w": 1024,
              "h": 768,
              "resize": "fit"
            }
          }
        }
      ]
    },
    "extended_entities": {
      "media": [
        {
          "id": 657745078413201408,
          "id_str": "657745078413201408",
          "indices": [
            106,
            129
          ],
          "media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
          "media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
          "url": "https://t.co/95GaLC4XTo",
          "display_url": "pic.twitter.com/95GaLC4XTo",
          "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
          "type": "photo",
          "sizes": {
            "small": {
              "w": 340,
              "h": 255,
              "resize": "fit"
            },
            "medium": {
              "w": 600,
              "h": 450,
              "resize": "fit"
            },
            "thumb": {
              "w": 150,
              "h": 150,
              "resize": "crop"
            },
            "large": {
              "w": 1024,
              "h": 768,
              "resize": "fit"
            }
          }
        },
        {
          "id": 657745085275095040,
          "id_str": "657745085275095040",
          "indices": [
            106,
            129
          ],
          "media_url": "http://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
          "media_url_https": "https://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
          "url": "https://t.co/95GaLC4XTo",
          "display_url": "pic.twitter.com/95GaLC4XTo",
          "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
          "type": "photo",
          "sizes": {
            "small": {
              "w": 340,
              "h": 453,
              "resize": "fit"
            },
            "medium": {
              "w": 600,
              "h": 800,
              "resize": "fit"
            },
            "thumb": {
              "w": 150,
              "h": 150,
              "resize": "crop"
            },
            "large": {
              "w": 768,
              "h": 1024,
              "resize": "fit"
            }
          }
        },
        {
          "id": 657745085300277248,
          "id_str": "657745085300277248",
          "indices": [
            106,
            129
          ],
          "media_url": "http://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
          "media_url_https": "https://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
          "url": "https://t.co/95GaLC4XTo",
          "display_url": "pic.twitter.com/95GaLC4XTo",
          "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
          "type": "photo",
          "sizes": {
            "small": {
              "w": 340,
              "h": 453,
              "resize": "fit"
            },
            "medium": {
              "w": 600,
              "h": 800,
              "resize": "fit"
            },
            "thumb": {
              "w": 150,
              "h": 150,
              "resize": "crop"
            },
            "large": {
              "w": 768,
              "h": 1024,
              "resize": "fit"
            }
          }
        },
        {
          "id": 657745085275082752,
          "id_str": "657745085275082752",
          "indices": [
            106,
            129
          ],
          "media_url": "http://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
          "media_url_https": "https://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
          "url": "https://t.co/95GaLC4XTo",
          "display_url": "pic.twitter.com/95GaLC4XTo",
          "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
          "type": "photo",
          "sizes": {
            "small": {
              "w": 340,
              "h": 255,
              "resize": "fit"
            },
            "medium": {
              "w": 600,
              "h": 450,
              "resize": "fit"
            },
            "thumb": {
              "w": 150,
              "h": 150,
              "resize": "crop"
            },
            "large": {
              "w": 1024,
              "h": 768,
              "resize": "fit"
            }
          }
        }
      ]
    },
    "favorited": false,
    "retweeted": false,
    "possibly_sensitive": false,
    "filter_level": "low",
    "lang": "en"
  },
  "is_quote_status": false,
  "retweet_count": 0,
  "favorite_count": 0,
  "entities": {
    "hashtags": [
      {
        "text": "London",
        "indices": [
          91,
          98
        ]
      }
    ],
    "urls": [
      {
        "url": "https://t.co/HapqKa4i0l",
        "expanded_url": "http://instagram.com/m23projects",
        "display_url": "instagram.com/m23projects",
        "indices": [
          99,
          122
        ]
      }
    ],
    "user_mentions": [
      {
        "screen_name": "M23projects",
        "name": "M23",
        "id": 999716342,
        "id_str": "999716342",
        "indices": [
          3,
          15
        ]
      },
      {
        "screen_name": "victoriamiro",
        "name": "Victoria Miro",
        "id": 373924746,
        "id_str": "373924746",
        "indices": [
          75,
          88
        ]
      }
    ],
    "symbols": [],
    "media": [
      {
        "id": 657745078413201408,
        "id_str": "657745078413201408",
        "indices": [
          123,
          140
        ],
        "media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
        "url": "https://t.co/95GaLC4XTo",
        "display_url": "pic.twitter.com/95GaLC4XTo",
        "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
        "type": "photo",
        "sizes": {
          "small": {
            "w": 340,
            "h": 255,
            "resize": "fit"
          },
          "medium": {
            "w": 600,
            "h": 450,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "large": {
            "w": 1024,
            "h": 768,
            "resize": "fit"
          }
        },
        "source_status_id": 657745100739506176,
        "source_status_id_str": "657745100739506176",
        "source_user_id": 999716342,
        "source_user_id_str": "999716342"
      }
    ]
  },
  "extended_entities": {
    "media": [
      {
        "id": 657745078413201408,
        "id_str": "657745078413201408",
        "indices": [
          123,
          140
        ],
        "media_url": "http://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CSDHqfeUkAA4a0Y.jpg",
        "url": "https://t.co/95GaLC4XTo",
        "display_url": "pic.twitter.com/95GaLC4XTo",
        "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
        "type": "photo",
        "sizes": {
          "small": {
            "w": 340,
            "h": 255,
            "resize": "fit"
          },
          "medium": {
            "w": 600,
            "h": 450,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "large": {
            "w": 1024,
            "h": 768,
            "resize": "fit"
          }
        },
        "source_status_id": 657745100739506176,
        "source_status_id_str": "657745100739506176",
        "source_user_id": 999716342,
        "source_user_id_str": "999716342"
      },
      {
        "id": 657745085275095040,
        "id_str": "657745085275095040",
        "indices": [
          123,
          140
        ],
        "media_url": "http://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CSDHq5CUwAAC-6a.jpg",
        "url": "https://t.co/95GaLC4XTo",
        "display_url": "pic.twitter.com/95GaLC4XTo",
        "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
        "type": "photo",
        "sizes": {
          "small": {
            "w": 340,
            "h": 453,
            "resize": "fit"
          },
          "medium": {
            "w": 600,
            "h": 800,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "large": {
            "w": 768,
            "h": 1024,
            "resize": "fit"
          }
        },
        "source_status_id": 657745100739506176,
        "source_status_id_str": "657745100739506176",
        "source_user_id": 999716342,
        "source_user_id_str": "999716342"
      },
      {
        "id": 657745085300277248,
        "id_str": "657745085300277248",
        "indices": [
          123,
          140
        ],
        "media_url": "http://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CSDHq5IVAAAn2YH.jpg",
        "url": "https://t.co/95GaLC4XTo",
        "display_url": "pic.twitter.com/95GaLC4XTo",
        "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
        "type": "photo",
        "sizes": {
          "small": {
            "w": 340,
            "h": 453,
            "resize": "fit"
          },
          "medium": {
            "w": 600,
            "h": 800,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "large": {
            "w": 768,
            "h": 1024,
            "resize": "fit"
          }
        },
        "source_status_id": 657745100739506176,
        "source_status_id_str": "657745100739506176",
        "source_user_id": 999716342,
        "source_user_id_str": "999716342"
      },
      {
        "id": 657745085275082752,
        "id_str": "657745085275082752",
        "indices": [
          123,
          140
        ],
        "media_url": "http://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
        "media_url_https": "https://pbs.twimg.com/media/CSDHq5CUkAAd0oL.jpg",
        "url": "https://t.co/95GaLC4XTo",
        "display_url": "pic.twitter.com/95GaLC4XTo",
        "expanded_url": "http://twitter.com/M23projects/status/657745100739506176/photo/1",
        "type": "photo",
        "sizes": {
          "small": {
            "w": 340,
            "h": 255,
            "resize": "fit"
          },
          "medium": {
            "w": 600,
            "h": 450,
            "resize": "fit"
          },
          "thumb": {
            "w": 150,
            "h": 150,
            "resize": "crop"
          },
          "large": {
            "w": 1024,
            "h": 768,
            "resize": "fit"
          }
        },
        "source_status_id": 657745100739506176,
        "source_status_id_str": "657745100739506176",
        "source_user_id": 999716342,
        "source_user_id_str": "999716342"
      }
    ]
  },
  "favorited": false,
  "retweeted": false,
  "possibly_sensitive": false,
  "filter_level": "low",
  "lang": "en",
  "timestamp_ms": "1445669074321"
}

**UPDATE:** I guess I should stick to play-json, even more so for performance reasons: http://derekwyatt.org/2014/01/15/benchmarking-spray-json-argonaut-play-json/

You can depend on Play's JSON library by itself:

// build.sbt
libraryDependencies += "com.typesafe.play" % "play-json_2.11" % "X.X.X"

// Tweet.scala
import play.api.libs.json._

// Field names and types must match the JSON keys ("id" is a number, the body is "text").
// In a real source file the implicit vals need to live inside an object.
case class User(id: Long, name: String, ...)
implicit val userFormat = Json.format[User]

case class Tweet(id: Long, text: String, user: User)
implicit val tweetFormat = Json.format[Tweet]

This will use play-json's macros to auto-generate the formatters you need to parse JSON into instances of Tweet and User.
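As a minimal sketch of how such macro-generated formatters might be used, the block below parses a raw JSON string into a Tweet. The three-field User, the three-field Tweet, and the TweetParsing helper are illustrative assumptions, not part of the original answer; only a handful of keys from the sample tweet are mapped, and keys without a matching field are ignored.

```scala
import play.api.libs.json._

// A deliberately small subset of the tweet's fields (an illustrative choice).
// Field names match the JSON keys so the macro-generated formats line up.
case class User(id: Long, name: String, screen_name: String)
object User {
  implicit val format: Format[User] = Json.format[User]
}

case class Tweet(id: Long, text: String, user: User)
object Tweet {
  implicit val format: Format[Tweet] = Json.format[Tweet]
}

// Hypothetical helper, named here only for illustration.
object TweetParsing {
  // Right(tweet) when validation succeeds, Left(description) when a field
  // is missing or has the wrong type. Json.parse itself throws if the input
  // is not well-formed JSON.
  def parse(raw: String): Either[String, Tweet] =
    Json.parse(raw).validate[Tweet] match {
      case JsSuccess(tweet, _) => Right(tweet)
      case JsError(errors)     => Left(errors.mkString("; "))
    }
}
```

Calling `TweetParsing.parse(rawJson)` on the sample tweet should give a `Right` holding the tweet, while a payload missing `text` or `user` gives a `Left`. Using `validate` rather than `as` keeps missing or mistyped fields as values instead of exceptions.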

Regardless of which library you choose, you won't find an elegant solution for handling more than 22 fields, since that is a limitation of the case class implementation (up until Scala 2.11) rather than a specific design choice by any library.
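One common workaround, sketched here under assumptions, is to split a wide JSON object into several small case classes and read each group from the same flat object. The grouping into UserCore and UserCounts and the choice of fields are hypothetical; the sketch only covers reading, and it relies on play-json's Reads supporting map and flatMap.

```scala
import play.api.libs.json._

// Each group stays well under 22 fields, so the macro can derive a Reads for it.
case class UserCore(id: Long, name: String, screen_name: String)
case class UserCounts(followers_count: Int, friends_count: Int, statuses_count: Int)

// The full user is assembled from the groups instead of one very wide case class.
case class User(core: UserCore, counts: UserCounts)

object User {
  implicit val coreReads: Reads[UserCore] = Json.reads[UserCore]
  implicit val countsReads: Reads[UserCounts] = Json.reads[UserCounts]

  // Both group readers run against the same flat JSON object; each one
  // picks out the keys it knows about and ignores the rest.
  implicit val userReads: Reads[User] =
    for {
      core   <- coreReads
      counts <- countsReads
    } yield User(core, counts)
}
```

With these in scope, `Json.parse(userJson).validate[User]` reads the flat "user" object from the sample tweet into the nested structure. Because the readers are chained with flatMap, validation stops at the first group that fails rather than collecting errors from every group.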
