AWS Athena: query JSON array data
I can't query my S3 files with AWS Athena. Each file contains a regular JSON array, like this:
[
{
"DataInvio": "2020-02-06T13:37:00+00:00",
"DataLettura": "2020-02-06T13:35:50+00:00",
"FlagDownloaded": 0,
"GUID": "f257c9c0-b7e1-4663-8d6d-97e652b27c10",
"IMEI": "866100000062167",
"Id": 0,
"IdSessione": "4bd169ff-307c-4fbf-aa63-fce972f43fa2",
"IdTagLocal": 0,
"SerialNumber": "142707160028BJZZZZ",
"Tag": "E200001697080089188056D2",
"Tipo": "B",
"TipoEvento": "L",
"TipoSegnalazione": 0,
"TipoTag": "C",
"UsrId": "10642180-1e34-44ac-952e-9cb3e8e6a03c"
},
{
"DataInvio": "2020-02-06T13:37:00+00:00",
"DataLettura": "2020-02-06T13:35:50+00:00",
"FlagDownloaded": 0,
"GUID": "e531272e-465c-4294-950d-95a683ff8e3b",
"IMEI": "866100000062167",
"Id": 0,
"IdSessione": "4bd169ff-307c-4fbf-aa63-fce972f43fa2",
"IdTagLocal": 0,
"SerialNumber": "142707160028BJZZZZ",
"Tag": "E200341201321E0000A946D2",
"Tipo": "B",
"TipoEvento": "L",
"TipoSegnalazione": 0,
"TipoTag": "C",
"UsrId": "10642180-1e34-44ac-952e-9cb3e8e6a03c"
}
]
If I create the table this way, select * from mytable returns empty rows:
CREATE EXTERNAL TABLE IF NOT EXISTS mydb.mytable (
`IdSessione` string,
`DataLettura` date,
`GUID` string,
`DataInvio` date
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'ignore.malformed.json' = 'true'
) LOCATION 's3://athenatestsavino/files/anthea/'
TBLPROPERTIES ('has_encrypted_data'='false')
Or it gives me the error HIVE_CURSOR_ERROR: Row is not a valid JSON Object - JSONException: Missing value at 1 [character 2 line 1]
if the table is created like this instead:
CREATE EXTERNAL TABLE IF NOT EXISTS mydb.mytable(
`IdSessione` string,
`DataLettura` date,
`GUID` string,
`DataInvio` date
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
) LOCATION 's3://athenatestsavino/files/anthea/'
TBLPROPERTIES ('has_encrypted_data'='false')
If I change the file contents to the following (one JSON object per line, with no trailing commas), the query returns results:
{ "DataInvio": "2020-02-06T13:37:00+00:00", "DataLettura": "2020-02-06T13:35:50+00:00",....}
{ "DataInvio": "2020-02-07T13:37:00+00:00", "DataLettura": "2020-02-06T13:35:50+00:00",....}
How can I query the JSON array structure directly?
This is caused by the format of the JSON objects. Solutions to these problems are also described here: https://aws.amazon.com/premiumsupport/knowledge-center/error-json-athena/
In addition, if you use AWS Glue to crawl these files, make sure the classification of the table in the Data Catalog is not "Unknown".
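Since the question already shows that one JSON object per line works, one possible workaround is to rewrite each file in that layout before Athena reads it. The following is only a minimal sketch in Python, assuming each file is small enough to load into memory; the function names json_array_to_lines and convert_file are hypothetical helpers, not part of any AWS library, and uploading the result to S3 is left out.

```python
import json
from pathlib import Path


def json_array_to_lines(text: str) -> str:
    """Turn a JSON array of objects into one JSON object per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # json.dumps without an indent argument emits each record on a
    # single line, which is the layout that worked in the question.
    return "".join(json.dumps(rec, ensure_ascii=False) + "\n" for rec in records)


def convert_file(src: Path, dst: Path) -> None:
    """Read a JSON-array file (hypothetical paths) and write the per-line version."""
    converted = json_array_to_lines(src.read_text(encoding="utf-8"))
    dst.write_text(converted, encoding="utf-8")


if __name__ == "__main__":
    # Shortened sample in the shape of the data from the question.
    sample = """[
      {"GUID": "f257c9c0-b7e1-4663-8d6d-97e652b27c10", "Id": 0},
      {"GUID": "e531272e-465c-4294-950d-95a683ff8e3b", "Id": 0}
    ]"""
    print(json_array_to_lines(sample), end="")
```

The converted files would then be placed under the table's LOCATION in place of the original array files, so the existing CREATE EXTERNAL TABLE statement can stay unchanged.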