Upload CSVs of JSON data from S3 to Redshift

I have thousands of unusually formatted CSVs sitting in S3 that I need to load into Redshift.

The CSVs are formatted like so:

 Column A            Column B            ..... Column Z
{"id": 2034823    "created": "2017-1-1"       "result": true} 

In other words, each row of the CSV is valid JSON.

I've tried a simple COPY command, but to no avail. I also tried adding the format as json 'auto' option, but I'm still receiving errors:

Invalid Value: err_code 1216, line number 1, position 0

Is there a recommended way to handle CSVs in this format? I want to save them into an existing Redshift table that already has types defined.

I have exactly the same type of files. These are the steps I followed to load them into a Redshift table:

  1. Create an external table in Redshift Spectrum with a struct column.
  2. Insert into your Redshift table from the external table above.

In your case:

1.
-- External table in the Spectrum schema, with the JSON fields declared in a struct column
CREATE EXTERNAL TABLE <spectrum schema>.<your external table>
(
data struct<
id:integer,
created:timestamp,
...
result:varchar(5)>
)
row format serde 'org.openx.data.jsonserde.JsonSerDe'
with serdeproperties (
'dots.in.keys' = 'true',
'mapping.requesttime' = 'requesttimestamp')
location 's3://<your S3 bucket>';

2.
-- Copy the struct fields into the existing, typed Redshift table
INSERT INTO <your Redshift table>
SELECT data.id, data.created, ..., data.result
  FROM <spectrum schema>.<your external table>;
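
Before pointing Spectrum at the files, it may be worth checking that every line really parses as a JSON object on its own. An error at line 1, position 0 could mean the first line (for example a header row) is not JSON, though that is only a guess. Below is a minimal Python sketch of such a check. The function name find_invalid_json_lines and the sample rows are hypothetical and only for illustration; in practice you would read the lines from a file downloaded from S3.

```python
import json


def find_invalid_json_lines(lines):
    """Return (line_number, reason) for every non-blank line that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        try:
            value = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((number, exc.msg))
            continue
        if not isinstance(value, dict):
            # Valid JSON, but not an object with named fields
            problems.append((number, "not a JSON object"))
    return problems


# Hypothetical sample: a header row followed by two data rows.
sample = [
    "Column A,Column B,Column Z",
    '{"id": 2034823, "created": "2017-1-1", "result": true}',
    '{"id": 2034824, "created": "2017-1-2", "result": false}',
]

for number, reason in find_invalid_json_lines(sample):
    print(f"line {number}: {reason}")
```

If this reports the first line of each file, stripping the header before loading may be enough; if it reports data rows, the files are probably not one JSON object per line and would need cleaning first.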

See how to set up Redshift Spectrum: https://docs.aws.amazon.com/redshift/latest/dg/c-getting-started-using-spectrum.html

Let me know if you have further questions.
