What is the difference between a data lake on HDFS and one on S3 in AWS?
I need to build a data lake on AWS, but I don't know exactly how S3 differs from HDFS. I found some answers on the Internet, but I still don't understand the real difference.
I would also like to know whether anyone has a data lake architecture using HDFS and S3 on AWS.
HDFS is only accessible to the Hadoop cluster in which it exists. If the cluster is shut down or terminated, the data in HDFS will be gone.
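One way to see this coupling is in the addresses themselves. The following is only a minimal sketch, not a real client: an HDFS URI names a NameNode inside one particular cluster, while an S3 URI names a bucket and no cluster at all. The host name, bucket name, paths, and the helper `describe_location` are all hypothetical and chosen for illustration; `s3a` is the scheme used by Hadoop's S3A connector.

```python
from urllib.parse import urlparse

# Schemes treated as S3 object storage in this sketch.
# "s3a" is the scheme used by Hadoop's S3A connector.
OBJECT_STORE_SCHEMES = {"s3", "s3a"}


def describe_location(uri):
    """Hypothetical helper: report what a storage URI is bound to."""
    parts = urlparse(uri)
    if parts.scheme == "hdfs":
        # The authority is the NameNode of one specific Hadoop cluster,
        # so the data lives and dies with that cluster.
        return {
            "store": "hdfs",
            "authority": parts.netloc,
            "path": parts.path,
            "tied_to_cluster": True,
        }
    if parts.scheme in OBJECT_STORE_SCHEMES:
        # The authority is a bucket name; no cluster appears in the address.
        return {
            "store": "s3",
            "authority": parts.netloc,
            "path": parts.path.lstrip("/"),
            "tied_to_cluster": False,
        }
    raise ValueError(f"unsupported scheme: {parts.scheme!r}")


# Hypothetical example locations (names are assumptions, not real resources).
hdfs_info = describe_location(
    "hdfs://namenode.example.internal:8020/data/events/part-0000.parquet"
)
s3_info = describe_location("s3://example-data-lake/raw/events/part-0000.parquet")
```

Here `hdfs_info["authority"]` is a host and port inside the cluster, whereas `s3_info["authority"]` is just the bucket name, which is why a path in the first form stops resolving once the cluster is terminated.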
Data in Amazon S3: