
Apache Spark concurrent program example

I want the following simple hello world program to be executed 100 times in parallel in Apache Spark.

  public class SimpleHelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
  }

So after executing in parallel, it should print "Hello World" 100 times.

How can I do this in standalone Apache Spark?

It depends on what you really want:

  1. Multi-threading in the Spark driver, e.g.

    import scala.collection.parallel._
    // On Scala 2.12 and later, import java.util.concurrent.ForkJoinPool instead
    import scala.concurrent.forkjoin._

    // 100 elements, one per "Hello World"
    val pool = (1 to 100).par
    // back the parallel collection with a pool of 100 concurrent threads
    pool.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(100))
    pool.foreach(_ => println("Hello World"))

  2. "Multi-threading" per Spark executor task, e.g.

    // toDF() needs the session implicits (spark-shell imports them for you)
    // create 100 partitions
    val df = sc.parallelize(1 to 100, 100).toDF()
    // print "Hello World" once per partition
    df.foreachPartition(_ => println("Hello World"))
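
Because the program in the question is written in Java, the driver-side option can also be sketched with the JDK alone. This is a minimal sketch, not Spark code: the class name ParallelHelloWorld and the helper runHelloWorld are hypothetical, and a fixed pool of 100 threads stands in for the ForkJoinPool used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelHelloWorld {

    // Hypothetical helper: submits `tasks` print jobs to a fixed pool of
    // `threads` threads and returns how many jobs completed.
    public static int runHelloWorld(int tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    System.out.println("Hello World");
                    completed.incrementAndGet();
                }));
            }
            // Wait for every task; get() rethrows a task failure
            for (Future<?> f : futures) {
                f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Hello World task failed", e);
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int done = runHelloWorld(100, 100);
        System.out.println("Completed tasks: " + done);
    }
}
```

Like the Scala version, this runs entirely inside one JVM, so it uses the cores of the driver machine only and does not distribute any work to Spark executors.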

This will do what you want in Scala with Spark 2.x:

    sparkSession.range(100)
      .foreach(_ => println("Hello World"))

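However, when this runs on a cluster you won't see the printed lines on the driver, because the println calls are executed on the worker nodes. The output ends up in the executors' stdout instead.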

Hi, here is how to do this if you want it to run as a job on a Spark cluster.

A Spark job first needs to create an RDD. You then apply Spark transformations and actions to it to do the computation, and Spark runs them in parallel automatically.

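    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class HelloWorld {

        public static void main(String[] args) throws Exception {
            // JavaSparkContext is Closeable, so try-with-resources stops it on exit
            try (JavaSparkContext sc = setupSparkContext()) {
                JavaRDD<String> helloworldRDD = sc.textFile("//your hellworld file");
                // map is lazy; collect() is the action that triggers the job
                helloworldRDD.map(x -> {
                    // runs on the executors: print each input line 100 times
                    for (int i = 0; i < 100; i++) {
                        System.out.println(x);
                    }
                    return x;
                }).collect();
            }
        }

        private static JavaSparkContext setupSparkContext() {
            // the master URL is expected to be supplied by spark-submit
            SparkConf conf = new SparkConf().setAppName("helloworld");
            return new JavaSparkContext(conf);
        }
    }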
