
How to configure a Java client connecting to an AWS EMR Spark cluster

I'm trying to write a simple Spark application. When I run it locally, it works with the master set as:

.master("local[2]")

But after configuring a Spark cluster on AWS (EMR), I can't connect to the master URL:

.master("spark://<master url>:7077")

Is this the way to do it? Am I missing something here? The cluster is up and running, and when I added my application as a step JAR so that it runs directly on the cluster, it worked. But I want to be able to run it from a remote machine.

I would appreciate some help here. Thanks.

To run from a remote machine, you will need to open the appropriate ports in the Security Group assigned to your EMR master node. You will need to add at least port 7077.

If by "remote" you mean a machine that isn't in your AWS environment, you will also need to set up a way to route traffic to the cluster from the outside.
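Before changing the Spark configuration, it can help to confirm that the port is actually reachable from the client machine. The following is only a minimal sketch using plain Java sockets; the class name `PortCheck`, the method `isReachable`, and the default host and timeout are illustrative assumptions, not part of Spark or EMR. A successful TCP connection shows only that the security group and routing allow the traffic, not that a Spark master is listening behind the port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration; not part of Spark or the AWS SDK.
final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the EMR master's address and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7077;

        if (isReachable(host, port, 3000)) {
            System.out.println(host + ":" + port + " is reachable");
        } else {
            System.out.println(host + ":" + port + " is NOT reachable; "
                    + "check the security group rules and network routing");
        }
    }
}
```

Run it from the remote machine, for example as `java PortCheck <master url> 7077`. If the check fails, the problem is most likely in the security group or the network path rather than in the application code.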
