
Copy and unzip from S3 to HDFS

Original post 2016-07-20 06:13:46 | Tags: json, apache-spark, amazon-s3, zip, gzip

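I have some large zip files on S3. Each of these zip files contains several gz files, which hold data in JSON format. I need to (i) copy the gz files to HDFS and (ii) process the files, preferably with Apache Spark / Impala / Hive. What is the easiest/best way to do this?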

1 answer

1) Try using distcp to copy the files from S3 to HDFS.

2) For processing, use read.json of "org.apache.spark.sql.hive.HiveContext" to read the JSON data from HDFS and create a DataFrame. Then run whatever operations you need on it.
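The two steps above could be sketched roughly as follows in Python. This is only an illustration, not the answerer's code: the function names, the bucket and HDFS paths, and the `s3a://` scheme are assumptions. It also uses `SparkSession` with Hive support, the Spark 2.0+ entry point that superseded `HiveContext`; on Spark 1.x the same `read.json` call is available on a `HiveContext`.

```python
import subprocess


def build_distcp_command(s3_uri, hdfs_uri):
    """Return the argv list for `hadoop distcp <source> <destination>`."""
    return ["hadoop", "distcp", s3_uri, hdfs_uri]


def copy_to_hdfs(s3_uri, hdfs_uri):
    """Step 1: run distcp. Needs the `hadoop` CLI on PATH and S3 credentials configured."""
    subprocess.run(build_distcp_command(s3_uri, hdfs_uri), check=True)


def read_json(spark, hdfs_path):
    """Step 2: load JSON data (one object per line) from HDFS into a DataFrame."""
    return spark.read.json(hdfs_path)


def make_spark_session(app_name="s3-json-to-hdfs"):
    """Create a Hive-enabled SparkSession. Requires pyspark to be installed."""
    # Imported lazily so the helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.appName(app_name).enableHiveSupport().getOrCreate()


# Hypothetical usage (paths are placeholders):
#   copy_to_hdfs("s3a://my-bucket/input/", "hdfs:///data/input/")
#   df = read_json(make_spark_session(), "hdfs:///data/input/*.gz")
#   df.printSchema()
```

Spark normally decompresses files ending in .gz transparently when reading them, so the gz files would not need to be unpacked first. The outer zip archives are a different matter, since distcp copies them unchanged.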

See this link: http://spark.apache.org/docs/latest/sql-programming-guide.html#creating-dataframes
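Because distcp copies objects byte for byte, the zip archives described in the question would still have to be unpacked somewhere before the gz members can be read. A minimal sketch using only the Python standard library is shown here. It assumes an archive has already been downloaded to local disk and that each gz member holds one JSON object per line; the function names are hypothetical.

```python
import gzip
import json
import zipfile
from pathlib import Path


def extract_gz_members(zip_path, out_dir):
    """Extract only the .gz members of a zip archive and return their paths."""
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith(".gz"):
                extracted.append(Path(archive.extract(name, out_dir)))
    return extracted


def read_json_lines(gz_path):
    """Parse a gzip file holding one JSON object per line (assumed layout)."""
    with gzip.open(gz_path, "rt", encoding="utf-8") as lines:
        return [json.loads(line) for line in lines if line.strip()]
```

The extracted gz files could then be uploaded with `hdfs dfs -put`, and `read_json_lines` gives a quick way to spot-check one file locally before processing everything in Spark.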

Related questions

Loading data from s3 to redshift using copy command

copy command for json.gz from s3 to redshift

Copy JSON with multiple values from S3 to Redshift

Redshift/S3 - Copy the contents of a Redshift table to S3 as JSON?

using python boto to copy json file from my local machine to amazon S3

Copy JSON files to local from AWS S3 bucket in windows system using CLI

Return JSON from S3

How To Store/Retrive from S3 As JSON

Parsing a JSON file from a S3 Bucket

snowflake read null from s3 json


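Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish one, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.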


粤ICP备18138465号 © 2020-2024 STACKOOM.COM