Is there any way to parse JSON with trailing commas in Ruby?

I'm currently coding a transition from a system that used hand-crafted JSON files to one that can automatically generate the JSON files. The old system works; the new system works; what I need to do is transfer data from the old system to the new one.

The JSON files are used by an iOS app to provide functionality, and have never been read by our Ruby on Rails server software before. To convert between the original system and the new system, I've started work on parsing the existing JSON files.

The problem is that one of my first two sample files has trailing commas in the JSON:

{ "sample data": [1, 2, 3,] }

This apparently went through just fine with the iOS app, because that file has been in use for a while. Now I need some way to parse the data provided in the file in my Ruby on Rails server, which (quite rightfully) throws an exception over the illegal trailing comma in the JSON file.

I can't just JSON.parse the file, because the parser, quite rightfully, rejects it as invalid JSON. Is there some way to parse it -- either an option I can pass to JSON.parse, or a gem that adds something, etc.? Or do I need to report back that we're going to have to hand-fix the broken files before the automated process can process them?

Edit:

Based on comments and requests, it looks like some additional data is called for. The JSON files in question are stored in .zip files on S3, stored via ActiveStorage. The process I'm writing needs to download, unpack, and parse the zip files, using the 'manifest.json' file as a key to convert the archived file into a database structure with multiple, smaller files stored on S3 instead of a single zip that contains everything. A (very) long term goal is for clients to stop downloading a unitary zip file, and instead download the files individually. The first step towards that is to break the zip files up on the server, which means the server needs to read in the zip files. A more detailed sample of the data follows. (Note that the structure contains several design decisions I later came to regret; one of the original ideas was to be able to re-use files rather than pack multiple copies of the same identical file, but YAGNI bit me in the rear there.)

The following includes comments that are not legal in JSON format:

{
  "defined_key": [
    {
      "name": "Object_with_subkeys",
      "key": "filename",
      "subkeys": [
        {
          "id":"1"
        },
        {
          "id":"2"
        },
        {
          "id":"3" // references to identifier on another defined key
        }, // Note trailing comma
      ]
    }
  ],
  "another_defined_key":[
    {
      "identifier": "should have made parent a hash with id as key instead of an array",
      "data":"metadata",
      "display_name":"Names: Can be very arbitrary",
      "user text":"Wait for the right {moment}", // I actually don't expect { or } in the strings, but they're completely legal and may have been used
      "thumbnail":"filename-2.png",
      "video-1":"filename-3.mov"
    }
  ]
}

The problem is that you are trying to parse something that looks a lot like JSON but is not actually JSON as defined by the spec.

Arrays - An array structure is a pair of square bracket tokens surrounding zero or more values. The values are separated by commas.

Since you have a trailing comma, another value is expected after it, and most JSON parsers will raise an error due to this violation.
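To make the failure concrete, here is a minimal sketch using only Ruby's standard json library with its default options. The helper name `strict_json?` is hypothetical (it is not part of any library); it simply reports whether a string is accepted by `JSON.parse`, which could also be used to list the files that would need hand-fixing.

```ruby
require 'json'

# Hypothetical helper: returns true when the text is accepted by the
# standard parser (default options), false when it raises a parse error.
def strict_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

puts strict_json?('{ "sample data": [1, 2, 3] }')   # prints true
puts strict_json?('{ "sample data": [1, 2, 3,] }')  # prints false
```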

All that being said, json-next will parse this appropriately, so maybe give that a shot.

It can parse JSON-like representations that completely violate the JSON spec, depending on the flavor you use (HanSON, SON, JSONX as defined in the gem).

Example:

json = "{ \"sample data\": [1, 2, 3,] }"
require 'json/next'
HANSON.parse(json)
#=> {"sample data"=>[1, 2, 3]}

But the following is equivalent and completely violates the spec:

JSONX.parse("{ \"sample data\": [1 2 3] }")
#=> {"sample data"=>[1, 2, 3]} 

So if you choose this route, do not expect to use it to validate the JSON data or structure in any fashion, and be aware that you could end up with unintended results.
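If adding a gem is not an option, a narrower alternative is to remove only the known deviations and then hand the result to the strict `JSON.parse`, so that everything else is still validated. The following is a rough sketch under stated assumptions, not a full parser: it assumes the only deviations are `//` line comments and trailing commas, and `parse_lenient_json` is a hypothetical helper name. It tracks string literals so that commas, braces, and `//` inside strings are left alone.

```ruby
require 'json'

# Hypothetical helper (not part of any library): drops // line comments and
# trailing commas that appear outside string literals, then calls JSON.parse.
def parse_lenient_json(text)
  cleaned = +''
  in_string = false
  i = 0
  while i < text.length
    ch = text[i]
    if in_string
      cleaned << ch
      if ch == '\\'
        # Copy the escaped character too, so that \" does not end the string.
        i += 1
        cleaned << text[i].to_s
      elsif ch == '"'
        in_string = false
      end
    elsif ch == '/' && text[i + 1] == '/'
      # Skip the comment; stop at the newline so the newline is still copied.
      i += 1 while i < text.length && text[i] != "\n"
      next
    else
      in_string = true if ch == '"'
      # A comma (plus whitespace) directly before a closing bracket is a
      # trailing comma, so remove it from what has been copied so far.
      cleaned.sub!(/,\s*\z/, '') if ch == ']' || ch == '}'
      cleaned << ch
    end
    i += 1
  end
  JSON.parse(cleaned)
end

p parse_lenient_json('{ "sample data": [1, 2, 3,] }')['sample data']  # prints [1, 2, 3]
```

Anything else that is malformed still raises JSON::ParserError. Note that the sketch is deliberately simple: the repeated `sub!` may be slow on very large files, and it also accepts a lone comma in an otherwise empty array or object.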
