
How do I install Apache Spark and get it up and running with Kafka?

I am quite new to Hadoop and Apache Spark and am just starting to experiment with them. I would now like to try out Apache Spark, which I assume means I have to install it on my machine.

I tried to set up a local machine using a VM, but I am lost at this point. Is there any resource that can help me install and configure Spark and Kafka on the same machine?

You are in luck: Chris Fregly (from the IBM Spark TC) has a project with Docker images for all of these things working together (you can see it at https://github.com/fluxcapacitor/pipeline/wiki ). For a "real" production deployment, you might want to look at deploying Spark on YARN or something similar; Spark's deployment options are explained at http://spark.apache.org/docs/latest/cluster-overview.html , and integrating it with Kafka is covered in the dedicated Kafka integration guide at http://spark.apache.org/docs/latest/streaming-kafka-integration.html . Welcome to the wonderful world of Spark; I hope these resources help you get started :)
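To make the Kafka side concrete, here is a minimal sketch of consuming a Kafka topic from Spark using the Structured Streaming Kafka source (note that the guide linked above documents the older DStream-based API; this sketch assumes Spark 2.x or later). The broker address localhost:9092 and the topic name events are assumptions for a single-machine setup like the one you describe:

```scala
import org.apache.spark.sql.SparkSession

object KafkaConsoleSketch {
  def main(args: Array[String]): Unit = {
    // Local mode is enough for experimenting on a single machine
    val spark = SparkSession.builder()
      .appName("KafkaConsoleSketch")
      .master("local[*]")
      .getOrCreate()

    // Subscribe to a Kafka topic; broker address and topic name are assumptions
    val kafkaDf = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumed local broker
      .option("subscribe", "events")                       // hypothetical topic
      .load()

    // Kafka delivers keys and values as binary, so cast the value to a string
    val lines = kafkaDf.selectExpr("CAST(value AS STRING)")

    // Print each micro-batch to the console so you can watch messages arrive
    val query = lines.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```

To run this you also need the Kafka connector on the classpath, for example via spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0 (match the artifact version to your Spark and Scala versions).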
