
How to connect to Spark running within a Docker instance

I'm trying to stand up Spark within a Docker instance, then connect to it from an external Python process.

Context: this setup is important for CI/CD of Spark-based code in Travis. I'm also hoping to use it to establish a consistent dev environment for a distributed team.

How do I do this?

This Docker image has been lovely for spinning up Spark: https://hub.docker.com/r/jupyter/pyspark-notebook/

Connecting via the dockerized notebook worked right out of the box. (Aside from debugging, I'm not actually using notebooks, so I might remove them later. For now, they're a good debugging tool.)

I haven't been able to connect from an external Python process (notebook or otherwise). Is there an environment variable that I need to set when I start Python or instantiate my SparkContext?

Did you expose the Spark ports correctly? Looking at the link you shared (https://hub.docker.com/r/jupyter/pyspark-notebook/), I cannot make out how you are starting the containers. You need to expose the Spark master port to the host and then use it from your Python code. Can you share the command you are using to start the containers (or your docker-compose.yml)? Also share the URL you are using from your Python code.
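To make the suggestion concrete, here is a minimal sketch of what "use the exposed master port from your Python code" could look like. It rests on assumptions the question does not confirm: that a Spark standalone master is actually running inside the container, that it listens on 7077 (the standalone default), and that the port was published to the host, for example with `-p 7077:7077`. The helper names and the app name are hypothetical.

```python
import socket


def spark_master_url(host: str = "localhost", port: int = 7077) -> str:
    """Build a standalone-mode master URL, e.g. spark://localhost:7077."""
    return f"spark://{host}:{port}"


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def connect(host: str = "localhost", port: int = 7077):
    """Create a SparkSession against the published master port (assumed setup)."""
    # Imported lazily so the helpers above work without pyspark installed.
    from pyspark.sql import SparkSession

    if not port_is_reachable(host, port):
        raise ConnectionError(
            f"Nothing is listening on {host}:{port}; was the port published?"
        )
    return (
        SparkSession.builder
        .master(spark_master_url(host, port))
        .appName("external-client")  # hypothetical app name
        .getOrCreate()
    )
```

Calling `connect()` from the host would then attempt to attach to the containerized master. Checking the port first helps separate "the port was never published" from Spark-level problems. Note that in standalone mode the executors also need to reach the driver, so a setting such as `spark.driver.host` may need adjusting depending on the Docker network setup.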

