Loading JSON from S3 to Redshift
I have the following JSON data in an S3 bucket:
{
"campaigns": [
{"campaign_reach": 123456,
"campaign_spend": 123456.0,
"campaign_goal": 12345678,
"id": "cda05a432b3b44c18c009a4a961f644a",
"campaign_name": "Campaign1",
"publisher_name": "PublisherA",
"campaign_impressions": 123456}],
"line_items": [],
"podcasts": [
{"podcast_name": "PodcastA", "id": "86edbca2dc644ba8960c8f4bd55bdc19"},
{"podcast_name": "PodcastB", "id": "fc3f2dc4c20949edaaf2186613ec7e47"}]
}
I am using COPY to load the "campaigns" portion to a table in Redshift.
I have tried loading using jsonpaths:
query_copy = """copy myschema.campaigns
from 's3://mybucket/mapping.json'
credentials 'aws_access_key_id=""" + acc + """;aws_secret_access_key=""" + sh + """'
json 's3://mybucket/campaign_jsonpaths.json'
;"""
My jsonpaths file "campaign_jsonpaths.json":
{
"jsonpaths": [
"$['id']",
"$['campaign_name']",
"$['campaign_reach'][0]",
"$['campaign_spend']",
"$['campaign_goal']",
"$['campaign_impressions']",
"$['publisher_name']",
]
}
I have also tried using json 'auto':
query_copy = """copy myschema.campaigns
from 's3://mybucket/mapping.json'
credentials 'aws_access_key_id=""" + acc + """;aws_secret_access_key=""" + sh + """'
json 'auto'
;"""
Both result in successful runs, but the table in Redshift is empty. No errors in stl_load_errors.
I found a similar posting here, but no answers were provided: Redshift: copy command Json data from s3
Any help would be much appreciated.
I was able to load the table successfully by doing the following:
Created campaigns table based on your JSON data:
create table campaigns (
    id varchar(100),
    campaign_name varchar(100),
    campaign_reach int,
    campaign_spend float,
    campaign_goal int,
    campaign_impressions int,
    publisher_name varchar(100)
);
Created a mapping.json file with your JSON data.
Created a campaigns_jsonpaths.json as follows:
{
    "jsonpaths": [
        "$['campaigns'][0]['id']",
        "$['campaigns'][0]['campaign_name']",
        "$['campaigns'][0]['campaign_reach']",
        "$['campaigns'][0]['campaign_spend']",
        "$['campaigns'][0]['campaign_goal']",
        "$['campaigns'][0]['campaign_impressions']",
        "$['campaigns'][0]['publisher_name']"
    ]
}
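To see why these paths carry the ['campaigns'][0] prefix, the lookup can be imitated locally. The sketch below is only a rough Python emulation of bracket-notation paths, not Redshift's own evaluator; `resolve` is a hypothetical helper and the document is the question's sample trimmed to a few fields. Paths that start at the root, such as $['id'], find nothing, because the only root keys are "campaigns", "line_items" and "podcasts".

```python
import json
import re

# Sample document from the question, trimmed to the fields used below.
document = json.loads("""
{
  "campaigns": [
    {"campaign_reach": 123456,
     "id": "cda05a432b3b44c18c009a4a961f644a",
     "campaign_name": "Campaign1"}
  ],
  "line_items": [],
  "podcasts": []
}
""")

# One bracket step: either ['key'] or [0].
_STEP = re.compile(r"\['([^']*)'\]|\[(\d+)\]")


def resolve(path, doc):
    """Walk a path like $['a'][0]['b']; return None if any step is missing.

    Hypothetical helper for local checks only; it does not reproduce
    Redshift's behaviour beyond simple key and index lookups.
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = doc
    for key, index in _STEP.findall(path[1:]):
        if index:
            # Array step: only valid on a list with enough elements.
            if not isinstance(current, list) or int(index) >= len(current):
                return None
            current = current[int(index)]
        else:
            # Key step: only valid on an object that has the key.
            if not isinstance(current, dict) or key not in current:
                return None
            current = current[key]
    return current


print(resolve("$['id']", document))                  # None
print(resolve("$['campaigns'][0]['id']", document))  # cda05a432b3b44c18c009a4a961f644a
```

Running each entry of a jsonpaths file through a check like this before uploading it can catch paths that point at the wrong level of the document.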
Ran copy:
copy campaigns
from 's3://<bucket>/mapping.json'
iam_role 'arn:aws:iam::1234567890:role/Redshift-Role'
json 's3://<bucket>/campaigns_jsonpaths.json';
Records were loaded successfully in the campaigns table.
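Note that the jsonpaths file above addresses index [0] only, so it picks up a single campaign. If the file held several campaigns, one possible alternative (an assumption on my part, not something I ran as part of the steps above) would be to reshape the data before uploading, writing each campaign as its own JSON object on its own line. COPY treats each top-level object in the file as one row, and json 'auto' matches object keys to column names. The helper name `campaigns_to_json_lines` is made up for this sketch, and uploading the result to S3 is left out.

```python
import json


def campaigns_to_json_lines(document_text):
    """Return one JSON object per line for each entry under "campaigns".

    Hypothetical preprocessing step: the output is meant to be written
    to a new file and loaded with json 'auto' instead of a jsonpaths file.
    """
    document = json.loads(document_text)
    campaigns = document.get("campaigns", [])
    return "\n".join(json.dumps(campaign) for campaign in campaigns)


# The sample document from the question (single campaign).
sample = """
{
  "campaigns": [
    {"campaign_reach": 123456,
     "campaign_spend": 123456.0,
     "campaign_goal": 12345678,
     "id": "cda05a432b3b44c18c009a4a961f644a",
     "campaign_name": "Campaign1",
     "publisher_name": "PublisherA",
     "campaign_impressions": 123456}],
  "line_items": [],
  "podcasts": [
    {"podcast_name": "PodcastA", "id": "86edbca2dc644ba8960c8f4bd55bdc19"},
    {"podcast_name": "PodcastB", "id": "fc3f2dc4c20949edaaf2186613ec7e47"}]
}
"""

# Prints one line per campaign, each a standalone JSON object.
print(campaigns_to_json_lines(sample))
```

The keys in each object already match the column names in the campaigns table, which is what json 'auto' relies on.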