Create Table in Athena From Nested JSON

I have nested JSON of this form:

[{
    "emails": [{
        "label": "",
        "primary": "",
        "relationdef_id": "",
        "type": "",
        "value": ""
    }],
    "licenses": [{
        "allocated": "",
        "parent_type": "",
        "parentid": "",
        "product_type": "",
        "purchased_license_id": "",
        "service_type": ""
    }, {
        "allocated": "",
        "parent_type": "",
        "parentid": "",
        "product_type": "",
        "purchased_license_id": "",
        "service_type": ""
    }]
}, {
    "emails": [{
        "label": "",
        "primary": "",
        "relationdef_id": "",
        "type": "",
        "value": ""
    }],
    "licenses": [{
        "allocated": "2016-04-26 01:46:26",
        "parent_type": "",
        "parentid": "",
        "product_type": "",
        "purchased_license_id": "",
        "service_type": ""
    }]
}]

I am unable to convert it into an Athena table.

I have also tried changing it to a list of objects:

{
        "emails": [{
                "label": "",
                "primary": "",
                "relationdef_id": "",
                "type": "",
                "value": ""
            }
        ],
        "licenses": [{
                "allocated": "",
                "parent_type": "",
                "parentid": "",
                "product_type": "",
                "purchased_license_id": "",
                "service_type": ""
            },{
                "allocated": "",
                "parent_type": "",
                "parentid": "",
                "product_type": "",
                "purchased_license_id": "",
                "service_type": ""
            }
        ]
    }
    {
        "emails": [{
                "label": "",
                "primary": "",
                "relationdef_id": "",
                "type": "",
                "value": ""
            }
        ],
        "licenses": [{
                "allocated": "",
                "parent_type": "",
                "parentid": "",
                "product_type": "",
                "purchased_license_id": "",
                "service_type": ""
            }
        ]
    }
    {
        "emails": [{
                "label": "",
                "primary": "",
                "relationdef_id": "",
                "type": "",
                "value": ""
            }
        ],
        "licenses": [{
                "allocated": "",
                "parent_type": "",
                "parentid": "",
                "product_type": "",
                "purchased_license_id": "",
                "service_type": ""
            }
        ]
    }

with the query:

CREATE EXTERNAL TABLE `test_orders1` (
  `emails` array<struct<`label`: string, `primary`: string, `relationdef_id`: string, `type`: string, `value`: string>>,
  `licenses` array<struct<`allocated`: string, `parent_type`: string, `parentid`: string, `product_type`: string, `purchased_license_id`: string, `service_type`: string>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ('ignore.malformed.json' = 'true')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'

But only 1 row is created. Is there a way to load nested JSON of JSONArray type into an Athena table? Or how should I change the nested JSON so that it works for me?

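When querying JSON data, Athena requires the files to be formatted with one JSON document per line. It is not clear from your question whether this is the case: the examples you give span multiple lines, but perhaps that is only to make the question easier to read.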
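If you are not sure which shape your files are in, a quick local check can help. The following is only a rough sketch in Python (the function name `is_json_lines` is made up for illustration and is not part of Athena or any SerDe): it treats the text as valid only when every non-empty line parses on its own as a JSON object.

```python
import json

def is_json_lines(text: str) -> bool:
    """Return True if every non-empty line is a standalone JSON object."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    for line in lines:
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            # A pretty-printed document breaks here, e.g. on a lone "{".
            return False
        if not isinstance(doc, dict):
            # A top-level array on one line is still not one record per line.
            return False
    return True

pretty = '{\n  "emails": []\n}'
compact = '{"emails": []}\n{"emails": []}'
print(is_json_lines(pretty))   # False
print(is_json_lines(compact))  # True
```

A file that fails this check, such as either of the multi-line samples in the question, would need to be rewritten before the table can return one row per record.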

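The table DDL you include looks like it should work on the second example of data, provided that it is formatted as one document per line, e.g.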

{"emails": [{"label": "", "primary": "", "relationdef_id": "", "type": "", "value": ""}], "licenses": [{"allocated": "", "parent_type": "", "parentid": "", "product_type": "", "purchased_license_id": "", "service_type": ""}, { "allocated": "", "parent_type": "", "parentid": "", "product_type": "", "purchased_license_id": "", "service_type": ""}]}
{"emails": [{"label": "", "primary": "", "relationdef_id": "", "type": "", "value": ""}], "licenses": [{"allocated": "", "parent_type": "", "parentid": "", "product_type": "", "purchased_license_id": "", "service_type": ""}]}
{"emails": [{"label": "", "primary": "", "relationdef_id": "", "type": "", "value": ""}], "licenses": [{"allocated": "", "parent_type": "", "parentid": "", "product_type": "", "purchased_license_id": "", "service_type": ""}]}
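If your data currently looks like the first example (a single top-level JSON array), one way to get to the layout above is to rewrite the file before uploading it. This is a minimal sketch in Python, assuming the whole array fits in memory. The helper name `array_to_json_lines` and the shortened sample records are my own illustration, not something from your setup.

```python
import json

def array_to_json_lines(array_text: str) -> str:
    """Turn a top-level JSON array into one compact JSON document per line."""
    records = json.loads(array_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # json.dumps without indent never emits newlines, so each record
    # ends up on exactly one line.
    return "".join(json.dumps(record) + "\n" for record in records)

# Shortened sample in the shape of the first example from the question.
sample = """[
  {"emails": [{"label": "", "value": ""}],
   "licenses": [{"allocated": "", "parentid": ""}]},
  {"emails": [{"label": "", "value": ""}],
   "licenses": [{"allocated": "2016-04-26 01:46:26", "parentid": ""}]}
]"""

out = array_to_json_lines(sample)
print(out, end="")
```

The output would then be saved as the data file under the table's S3 location. Nested arrays such as `emails` and `licenses` stay inside each line, which matches the `array<struct<...>>` columns in the DDL.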
