
What is the difference between a data lake with HDFS or S3 in AWS?

Originally posted 2019-07-11 08:31:27. Tags: amazon-s3, amazon-ec2, hdfs, data-lake

I need to build a data lake on AWS, but I don't know exactly how S3 differs from HDFS. I found some answers on the Internet, but I still don't understand the real difference.

I would also like to know whether anyone has a data lake architecture that uses HDFS and S3 on AWS.

1 answer

HDFS is only accessible to the Hadoop cluster in which it exists. If the cluster is shut down or terminated, the data in HDFS will be gone.
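
One common way to keep HDFS data beyond the life of a cluster is to copy it to S3 before the cluster is terminated. The following is a minimal sketch, assuming the `hadoop` CLI is available on the cluster and Hadoop's S3A connector is configured. The bucket name, paths, and the helper `build_distcp_command` are hypothetical, and the function only builds the command rather than running it:

```python
import shlex
from typing import List


def build_distcp_command(hdfs_path: str, bucket: str, prefix: str) -> List[str]:
    """Build a `hadoop distcp` command that copies an HDFS path to S3.

    The bucket and prefix are placeholders chosen for illustration.
    """
    if not hdfs_path.startswith("hdfs://"):
        raise ValueError("expected an hdfs:// source path")
    # s3a:// is the URI scheme used by Hadoop's S3A connector.
    destination = f"s3a://{bucket}/{prefix.strip('/')}"
    return ["hadoop", "distcp", hdfs_path, destination]


# Hypothetical source path and bucket; nothing is executed here.
command = build_distcp_command("hdfs:///data/events", "example-data-lake", "raw/events/")
print(shlex.join(command))
```

The resulting list could be handed to `subprocess.run` on a cluster node; it is kept as a pure function here so the command can be inspected first.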

Data in Amazon S3:

Remains available at all times (it cannot be 'turned off')
Is accessible to multiple clusters
Is accessible to other AWS services, such as Amazon Athena (which is 'Presto as a service', so you might not even need a Hadoop cluster)
Offers multiple storage classes, so less-frequently accessed data can be stored at a lower cost
Has no storage limit (whereas HDFS is limited to the storage available in the Hadoop cluster)
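
To illustrate the storage-class point, S3 lets a bucket carry lifecycle rules that move objects to cheaper classes as they age. Below is a minimal sketch that only builds such a rule as a plain dict, in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The prefix, the day thresholds, and the helper name `build_lifecycle_configuration` are assumptions for illustration, not recommendations from the answer:

```python
import json


def build_lifecycle_configuration(prefix: str, ia_after_days: int, glacier_after_days: int) -> dict:
    """Return an S3 lifecycle configuration that tiers objects under a prefix.

    Objects move to STANDARD_IA first, then to GLACIER. The thresholds are
    illustrative assumptions supplied by the caller.
    """
    if not 0 < ia_after_days < glacier_after_days:
        raise ValueError("thresholds must be positive and increasing")
    return {
        "Rules": [
            {
                # Hypothetical rule ID derived from the prefix.
                "ID": "tier-" + prefix.strip("/").replace("/", "-"),
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Hypothetical prefix and thresholds; no AWS call is made here.
lifecycle = build_lifecycle_configuration("raw/events/", 30, 180)
print(json.dumps(lifecycle, indent=2))
```

With credentials configured, the dict could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration` on a boto3 S3 client. HDFS has no equivalent built-in mechanism tied to pricing, since its capacity is simply the disks attached to the cluster.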


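Notice: the technical posts on this site are published under the CC BY-SA 4.0 license. If you republish this content, please credit this site's URL or the original address. For any questions, contact yoyou2525@163.com.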
