
Passing sparkSession as function parameters spark-scala

I'm in the process of generating tables using spark-scala and I am concerned about efficiency.

Would passing the SparkSession around make my program slower? Is it any slower than calling SparkSession.getOrCreate?

I am using YARN as the master.

Thanks in advance.

You can create the Spark session once and pass it around without losing any performance. However, it is a little inconvenient to modify method signatures just to pass in a session object. You can avoid that by simply calling getOrCreate inside the functions to obtain the same global session without passing it. When getOrCreate is first called, it registers the current session as the default (via SparkSession.setDefaultSession) and returns that same session to subsequent getOrCreate calls.

    val spark: SparkSession = SparkSession.builder
      .appName("test")
      .master("local[2]")
      .getOrCreate()

    // pass the session in as a parameter
    function1(spark)

    // obtain it inside the function without passing it
    def function2(): Unit = {
      val s = SparkSession.builder.getOrCreate()
    }
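For completeness, here is a minimal, self-contained sketch showing both approaches side by side. The object name DemoApp and the helper functions countRows and countRowsWithoutParam are hypothetical, and local[2] stands in for a YARN master:

    import org.apache.spark.sql.SparkSession

    // Hypothetical app; illustrates passing the session vs. re-obtaining it.
    object DemoApp {
      def main(args: Array[String]): Unit = {
        val spark: SparkSession = SparkSession.builder
          .appName("test")
          .master("local[2]") // use your YARN master in practice
          .getOrCreate()

        // Approach 1: pass the session explicitly as a parameter.
        countRows(spark)

        // Approach 2: rely on the default session set by the first getOrCreate.
        countRowsWithoutParam()

        spark.stop()
      }

      // Takes the session as a parameter; only a reference is passed, so no extra cost.
      def countRows(spark: SparkSession): Unit = {
        val df = spark.range(100).toDF("id")
        println(df.count())
      }

      // Obtains the same global session; getOrCreate returns the existing one.
      def countRowsWithoutParam(): Unit = {
        val spark = SparkSession.builder.getOrCreate()
        val df = spark.range(100).toDF("id")
        println(df.count())
      }
    }

Either way, the function ends up with a reference to the same driver-side session object, so neither approach adds measurable overhead; getOrCreate simply looks up the session that was already created.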

