Create AWS Athena table from json event

I am trying to create an AWS Athena table from a json event file stored in S3. I seem to be having trouble with the format of my json event. The event is delivered in this format:

"[{\\"String1\\":123,\\"String2\\":\\"abc\\",\\"String3\\":\\"def\\"}]"

When I create the table it doesn't show any data, as I don't think it can read the json string. My table creation code is:

CREATE EXTERNAL TABLE events (
String1 string,
String2 string,
String3 string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://events/';

I'm pretty sure I need other config for the format of my json event to be parsed correctly, but I'm not sure what. If I set my json event to the below, the table is created and works as intended.

{"String1":"123","String2":"abc","String3":"def"}
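To make the difference between the two payloads concrete, here is a minimal Python sketch. It assumes the delivered file holds a JSON string literal whose content is itself a JSON array, with a single backslash before each inner quote (the usual result of serializing JSON twice). The function name `decode_double_encoded` is hypothetical and only for illustration.

```python
import json


def decode_double_encoded(raw: str) -> list:
    """Decode a JSON string literal that wraps a JSON array of records."""
    inner = json.loads(raw)      # first pass: unwraps the outer string literal
    if not isinstance(inner, str):
        raise ValueError("payload is not a JSON-encoded string")
    return json.loads(inner)     # second pass: parses the actual array


# Assumed on-disk payload (single backslashes before the inner quotes).
raw_event = r'"[{\"String1\":123,\"String2\":\"abc\",\"String3\":\"def\"}]"'

records = decode_double_encoded(raw_event)
print(records)
```

The first pass only yields a plain string, which is consistent with a JSON SerDe finding no columns to map. The fields become visible only after the second pass.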

Does anyone have any pointers as to what I need to do for my file format to be read/parsed correctly?

Thanks.


UPDATE

I've managed to get my json data delivered without the \\ , so now I just need to handle the opening and closing brackets [...]:

[{"String1":123,"String2":"abc","String3":"def"}]

This format also causes an issue with my table, as it puts all the data into the first row. Without the [...] it is placed correctly. I think I need to use an array, so I'm looking into that.
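As a side note, if the file can be preprocessed before it lands in S3, one option is to rewrite the array as one JSON object per line, which matches the layout that worked without the [...]. The following is only a sketch under that assumption, and `array_to_json_lines` is a hypothetical helper name.

```python
import json


def array_to_json_lines(text: str) -> str:
    """Turn a JSON array of objects into newline-delimited JSON."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # Compact separators keep each record on a single line with no extra spaces.
    return "\n".join(json.dumps(rec, separators=(",", ":")) for rec in records)


event = '[{"String1":123,"String2":"abc","String3":"def"}]'
print(array_to_json_lines(event))
# prints: {"String1":123,"String2":"abc","String3":"def"}
```

With the data in this shape, the original flat table definition should apply, so no array column would be needed.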

I've resolved this using an array, as most will have expected.

My Athena create table query was therefore:

CREATE EXTERNAL TABLE events (
`details` array<struct<
String1:string,
String2:string,
String3:string >>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://events/';

This gave me parsed output that I could then run SELECT queries on, or wrap in a view to show my data in table format:

CREATE OR REPLACE VIEW "v_events" AS
SELECT
  item.string1,
  item.string2,
  item.string3
FROM
  (events
CROSS JOIN UNNEST("details") t (item))
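
For readers unfamiliar with CROSS JOIN UNNEST, the flattening the view performs can be pictured with a small Python sketch. It is an illustration of the idea rather than of how Athena executes the query. The sample row is assumed data, and `unnest_details` is a hypothetical name.

```python
def unnest_details(rows):
    """Emit one flat tuple per element of each row's `details` array."""
    flat = []
    for row in rows:
        # A row whose array is empty contributes nothing, as with a cross join.
        for item in row["details"]:
            flat.append((item["string1"], item["string2"], item["string3"]))
    return flat


# Assumed table contents: one row whose `details` column holds the parsed array.
table_rows = [
    {"details": [{"string1": "123", "string2": "abc", "string3": "def"}]},
]

print(unnest_details(table_rows))
# prints: [('123', 'abc', 'def')]
```

Each element of `details` becomes its own output row, which is why the view shows the data in ordinary table form.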
