We are using Kinesis Firehose to push data to S3 and to Redshift. We push the whole object to S3 and only a subset of its fields to Redshift.
Here is an example of the object we are currently pushing to Firehose.
[
{
field1: 1,
field2: 1,
arr: [
{inner_field1: 1, inner_field2: 1},
{inner_field1: 1, inner_field2: 1}
]
},
...
]
Right now only field1 and field2 are pushed to Redshift, but we would also like to push the arr field to Redshift.
The first option we considered is the new SUPER type, but I couldn't find any documentation on how to push a SUPER-type object from Firehose to Redshift.
The second option (preferred in our case) is to flatten the structure before pushing it to Redshift.
So, using the example object above, we would want a table with 4 columns (field1, field2, inner_field1, inner_field2), and the example object would produce 2 rows.
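The flattening described above could be sketched roughly as follows. This is only an illustration, assuming the records are reshaped in Python before they reach Redshift (for example in the producer or in a Firehose transformation step); the function name flatten_record and the array_key parameter are hypothetical and not part of any AWS API.

```python
import json


def flatten_record(record, array_key="arr"):
    """Return one flat dict per element of record[array_key].

    Hypothetical helper: parent fields are repeated on every row and the
    fields of each array element are merged in next to them.
    """
    parent = {k: v for k, v in record.items() if k != array_key}
    children = record.get(array_key) or []
    if not children:
        # No inner elements: keep the parent fields so the record is not lost.
        return [parent]
    return [{**parent, **child} for child in children]


example = {
    "field1": 1,
    "field2": 1,
    "arr": [
        {"inner_field1": 1, "inner_field2": 1},
        {"inner_field1": 1, "inner_field2": 1},
    ],
}

rows = flatten_record(example)

# One JSON object per line, so each flattened row can be loaded as its own row.
payload = "".join(json.dumps(row) + "\n" for row in rows)
```

With the example object, rows holds two dicts, each with the four keys field1, field2, inner_field1 and inner_field2, which matches the 4-column, 2-row table described above.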
Assuming your table format is:
CREATE TABLE super_test (
field1 INTEGER,
field2 INTEGER,
arr SUPER
);
I ended up finding success with the "Copying a JSON document into multiple SUPER data columns" solution when using the json_paths from this page: https://docs.aws.amazon.com/redshift/latest/dg/ingest-super.html
In my case, I have a JSON sub-object rather than an 'arr' array element, but I would think the solution would be the same since both are valid JSON constructs.
My COPY options in Kinesis Firehose are similar to:
format as json 's3://<bucket-name>/schema/kinesis-schema.json'
The AWS examples do not have the "as" in the "format as json" above. It is unclear to me whether that "as" is required; I only know that it works for me with it there.
Here is the full COPY statement reported by Firehose:
COPY super_test FROM 's3://<bucket-name>/<manifest>' CREDENTIALS 'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>' MANIFEST format as json 's3://<bucket-name>/schema/kinesis-schema.json';
where kinesis-schema.json would have the following format based on your field names:
{
"jsonpaths": [
"$.field1",
"$.field2",
"$.arr"
]
}
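If you would rather generate that file than write it by hand, a small sketch along these lines might do. It assumes the JSON keys have the same names as the table columns; build_jsonpaths is a hypothetical helper, not something Firehose or Redshift provides. The entries in a jsonpaths file are matched to columns by position, so the list has to be in the same order as the table columns (or the COPY column list).

```python
import json


def build_jsonpaths(columns):
    """Build a jsonpaths document with one top-level path per column.

    Hypothetical helper: assumes every column maps to a top-level JSON key
    of the same name, in the same order as the target table.
    """
    return {"jsonpaths": [f"$.{name}" for name in columns]}


# Columns of super_test, in table order.
schema = build_jsonpaths(["field1", "field2", "arr"])
schema_text = json.dumps(schema, indent=2)

# Write locally; uploading to s3://<bucket-name>/schema/ is left out here.
with open("kinesis-schema.json", "w") as fh:
    fh.write(schema_text + "\n")
```

The same helper would cover the flattened layout from the question by passing the four column names field1, field2, inner_field1 and inner_field2 instead.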
This is at least what works for me. Hopefully it helps point you in the right direction.