
Understanding Scala syntax for a higher-order function

I am new to Scala and need to understand what is going on in the code snippet below, specifically the sampleFunc val:

val sampleFunc: Seq[Row] => (Int, Long, Boolean, Row, String) = (mem: Seq[Row]) => {
  //some code
  (a1,b1,c1,d1,e1) // returning the value
}

spark.udf.register("sampleUDF", udf(sampleFunc,
      StructField(a, IntegerType),
      StructField(b, LongType),
      StructField(c, BooleanType),
      StructField(d, StructType(schema.fields)),
      StructField(e, StringType)
))


I see that the snippet uses Spark, but let's set that aside and just look at sampleFunc. It is quite simple. The following construct declares the function itself:

val sampleFunc: Seq[Row] => (Int, Long, Boolean, Row, String) = ...

where Seq[Row] is the function's argument type and (Int, Long, Boolean, Row, String) is its result type. In other words, you create a value of type Function1[Seq[Row], (Int, Long, Boolean, Row, String)].
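To make the equivalence concrete, here is a minimal sketch that runs without Spark. The Row case class is a hypothetical stand-in for org.apache.spark.sql.Row, and the function body and the shortened result tuple are made up for illustration.

```scala
// Hypothetical stand-in for org.apache.spark.sql.Row, so the sketch runs without Spark.
final case class Row(values: Any*)

object FunctionTypeSketch {
  // Sugared function type: Seq[Row] => (Int, String)
  val sugared: Seq[Row] => (Int, String) =
    (mem: Seq[Row]) => (mem.size, "rows")

  // The very same value, assigned to the desugared type.
  val desugared: Function1[Seq[Row], Tuple2[Int, String]] = sugared

  def main(args: Array[String]): Unit = {
    val input = Seq(Row(1), Row(2))
    println(sugared(input))         // f(x) is shorthand for f.apply(x)
    println(desugared.apply(input)) // same result, called explicitly
  }
}
```

Both calls return the same tuple, because the arrow notation and Function1 name the same type.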

Then comes the function body, or implementation if you will:

... = (mem: Seq[Row]) => {
  //some code
  (a1,b1,c1,d1,e1) // returning the value
}

where mem is the function's parameter. Its declared type must be the same as the argument type in the function's declared type, or a supertype of it. (Function1 is contravariant in its argument type and covariant in its result type. For more examples, see another good SO post: Why is Function[-A1,...,+B] not about allowing any supertypes as parameters?)
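If you want to check how the parameter type of the literal relates to the declared function type, a small experiment helps. The sketch below relies only on the standard library declaration Function1[-T1, +R]; the names lengthOf, general and specific are made up for illustration.

```scala
object VarianceSketch {
  // Declared argument type is String, but the literal accepts Any (a supertype).
  // This compiles: anything that handles Any can also handle String.
  val lengthOf: String => Int = (x: Any) => x.toString.length

  // The reverse is rejected by the compiler, since the literal could not
  // handle every possible Any:
  // val bad: Any => Int = (x: String) => x.length

  // Contravariance in the argument: a function from Any is usable
  // wherever a function from String is expected.
  val general: Any => Int = (x: Any) => x.hashCode
  val specific: String => Int = general
}
```

So the annotation on mem may be the declared argument type or something wider, never something narrower.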

The => token separates the parameter list from the function body that follows it.

If you come from Java or another imperative language, the same thing can also be written as a method:

def sampleFunc(mem: Seq[Row]): (Int, Long, Boolean, Row, String) = {
  //some code
  (a1,b1,c1,d1,e1) // returning the value
}
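A method and a function value are not quite the same thing, but Scala converts a method into a function value when a function type is expected (eta-expansion). Here is a rough sketch with made-up names (summarize, summarizeFn) and Seq[String] in place of Seq[Row] to keep it Spark-free:

```scala
object MethodVsFunctionSketch {
  // A method: declared with def, belongs to the enclosing object.
  def summarize(mem: Seq[String]): (Int, Boolean) = (mem.size, mem.isEmpty)

  // A function value: the method is eta-expanded because the expected
  // type on the left is a function type.
  val summarizeFn: Seq[String] => (Int, Boolean) = summarize

  def main(args: Array[String]): Unit = {
    val input = Seq("a", "b", "c")
    println(summarize(input))           // direct method call
    println(summarizeFn(input))         // call through the function value
    println(Seq(input).map(summarizeFn)) // a function value can be passed around
  }
}
```

This is why the val form is handy when something like udf(...) needs a function passed in as a value.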

Hope this helps!

//<-value name-> <-------------- value type-------------------->   <--------------implementation ----------------------->
//              <-arg type-> <-----result type --------------->   <-function argument->   <----func implementation ---->
val  sampleFunc:  Seq[Row]  => (Int, Long, Boolean, Row, String) = (mem: Seq[Row])      => { /*...*/; (a1,b1,c1,d1,e1) }

//same written differently:
//<-value name-> <-------------- value type-------------------------------->   <-------implementation ----------->
val sampleFunc: Function1[Seq[Row], Tuple5[Int, Long, Boolean, Row, String]] = {mem => /*...*/; (a1,b1,c1,d1,e1)}
  • value name: nothing special here. Just another val in your code.
  • value type: it is long but pretty straightforward. It is the Function1 type that takes Seq[Row] and returns Tuple5[Int, Long, Boolean, Row, String]. The => form is just Scala's nicer syntax for it.
  • implementation: we create a function that takes Seq[Row] using the => syntax. Also nothing special here.

Maybe it's easier to understand if you desugar the Tuple5 factory method invocation:

val sampleFunc: Seq[Row] => Tuple5[Int, Long, Boolean, Row, String] = 
    (mem: Seq[Row]) => Tuple5(a1,b1,c1,d1,e1)

and if you go further and replace the => in the type with Function1 you get:

Function1[Seq[Row], Tuple5[Int, Long, Boolean, Row, String]]

which means that sampleFunc is a function that takes an argument of type Seq[Row] and returns a Tuple5[Int, Long, Boolean, Row, String].
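The tuple half of that desugaring can be tried on its own. This is a small sketch with illustrative values, using a three-element tuple instead of Tuple5 for brevity:

```scala
object TupleSketch {
  // Parenthesised syntax and the explicit TupleN form build the same value.
  val sugared: (Int, Long, Boolean) = (1, 2L, true)
  val explicit: Tuple3[Int, Long, Boolean] = Tuple3(1, 2L, true)

  def main(args: Array[String]): Unit = {
    println(sugared == explicit) // structural equality holds
    println(sugared._2)          // positional access, counted from 1
    val (a, b, c) = sugared      // or destructure into separate values
    println(s"$a $b $c")
  }
}
```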

