
Data Locality in Spark on Kubernetes

Do we need HDFS or S3 when running Spark on Kubernetes? Will data locality still be efficient if we use just the NFS storage type?
Or is there something fundamentally wrong in my understanding of Spark on Kubernetes?

It depends. If you are working with external data (HDFS/S3 outside the cluster), then you won't have data locality, and performance won't be great: every read crosses the network between the storage system and your executor pods.
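For the external-storage case, a typical setup is to submit the job against the Kubernetes API server and read input over the s3a connector. This is a minimal sketch; the API server address, container image, bucket, and credentials are placeholders, while the `spark.kubernetes.*` and `fs.s3a.*` keys are standard Spark/Hadoop configuration properties.

```shell
# Sketch: Spark on Kubernetes reading from external S3 via s3a.
# <k8s-apiserver>, <your-spark-image>, and the credentials are placeholders.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name s3-read-job \
  --conf spark.executor.instances=3 \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf spark.hadoop.fs.s3a.access.key=<ACCESS_KEY> \
  --conf spark.hadoop.fs.s3a.secret.key=<SECRET_KEY> \
  local:///opt/spark/app.jar
```

With this layout every task reads its input partition over the network, so all reads are effectively at `ANY` locality level rather than `NODE_LOCAL`.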

You can run HDFS inside Kubernetes to try to avoid this issue, so that DataNodes and Spark executors are co-located on the same nodes.
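With HDFS deployed inside the cluster (for example via a Helm chart), executors can address the NameNode through its Kubernetes Service DNS name. A hedged sketch, assuming a Service named `hdfs-namenode` in the `default` namespace; the image and API server address are again placeholders:

```shell
# Sketch: Spark on Kubernetes reading from an in-cluster HDFS.
# "hdfs-namenode.default.svc.cluster.local" is an assumed Service name.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  --conf spark.hadoop.fs.defaultFS=hdfs://hdfs-namenode.default.svc.cluster.local:8020 \
  local:///opt/spark/app.jar
```

Note that co-locating HDFS and Spark in one cluster is necessary but not sufficient for locality: Spark matches the hostnames reported by HDFS block locations against executor hostnames, and in Kubernetes pod hostnames generally differ from node hostnames, so achieving real `NODE_LOCAL` reads typically needs additional locality-aware configuration.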
