
Configure Apache Spark for AWS EKS

I'd like to start working with Apache Spark on Kubernetes, but I don't have experience with it. I installed Spark via a Helm chart with ServiceType "LoadBalancer".
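For reference, the Helm installation step might look like the sketch below. The chart (Bitnami's) and the release name `my-spark` are assumptions, since the question doesn't say which chart was used:

```shell
# Sketch: install standalone Spark on the cluster via Helm, exposing the
# master through an AWS load balancer (assumes the Bitnami chart; the
# release name "my-spark" is arbitrary).
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-spark bitnami/spark \
  --set service.type=LoadBalancer
```

With `service.type=LoadBalancer` on EKS, Kubernetes provisions an ELB whose DNS name then serves as the `spark://...:7077` master address.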

spark-submit --master 'spark://LOADBALANCER.elb.eu-central-1.amazonaws.com:7077' \
--deploy-mode client \
--conf spark.kubernetes.container.image='MY_IMAGE' test.py

This is my test code, test.py:

from pyspark.sql import SparkSession

spark_session = SparkSession.builder \
    .getOrCreate()
l = [('Alice', 1)]
spark_session.createDataFrame(l).show()

Running locally on a microk8s cluster works, but running the same way against an AWS EKS cluster fails with the following warning repeating endlessly:

22/02/16 17:36:01 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks resource profile 0
22/02/16 17:36:16 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Is there a way to develop the user code and run it against the Kubernetes cluster, or should I create a new Docker image every time? Are there any best practices for Apache Spark on EKS?

Try changing to --deploy-mode cluster.
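A cluster-mode submission usually also means switching from the standalone `spark://` master to Spark's native Kubernetes scheduler, pointing at the EKS API server. A sketch, where the endpoint, service account, and S3 upload path are all placeholders to adapt (`spark.kubernetes.file.upload.path` lets spark-submit upload the local test.py so a new image isn't needed for every code change):

```shell
# Sketch: Kubernetes-native submission in cluster mode (all names,
# the service account, and the S3 bucket are assumptions).
spark-submit \
  --master k8s://https://EKS_API_SERVER_ENDPOINT:443 \
  --deploy-mode cluster \
  --name spark-eks-test \
  --conf spark.kubernetes.container.image='MY_IMAGE' \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --conf spark.kubernetes.file.upload.path=s3a://MY_BUCKET/spark-uploads \
  test.py
```

In cluster mode the driver runs inside the cluster, which avoids the problem of executors being unable to reach a driver running on your local machine; that unreachable driver is a common cause of the "Initial job has not accepted any resources" warning.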


 