lazy val function vs def method
When a function is called many times from an external class, which gives better performance: a lazy val function or a def method? So far, what I understood is:
def method -
lazy val lambda expression -
So it may seem that using a lazy val removes the need to evaluate the function every time. Should it be preferred?
I ran into this while writing UDFs for Spark code, and I'm trying to understand which approach is better.
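import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}

object sql {
  // Maps null, blank, "[]" and "null" strings to None; otherwise returns the trimmed string.
  def emptyStringToNull(str: String): Option[String] = {
    Option(str).getOrElse("").trim match {
      case ""     => None
      case "[]"   => None
      case "null" => None
      case _      => Some(str.trim)
    }
  }

  def udfEmptyStringToNull: UserDefinedFunction = udf(emptyStringToNull _)

  // Variant 1: a method.
  def repairColumn_method(dataFrame: DataFrame, colName: String): DataFrame = {
    dataFrame.withColumn(colName, udfEmptyStringToNull(col(colName)))
  }

  // Variant 2: a function value held in a lazy val.
  lazy val repairColumn_fun: (DataFrame, String) => DataFrame = { (df, colName) =>
    df.withColumn(colName, udfEmptyStringToNull(col(colName)))
  }
}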
There's no need for you to use a lazy val in this specific case. When you assign a function to a lazy val, its results are not memoized, as you seem to think they are. The only thing that is deferred and cached is the creation of the function object itself. Since that object is a plain function literal and not the result of an expensive computation (regardless of what goes on inside it), making it lazy is not useful. All it does is add overhead when accessing and calling it. A simple val would be better, but making it a proper method would be best.
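To make the point about memoization concrete, here is a minimal sketch. The names LazyFunctionDemo, square, initRuns and bodyRuns are made up for illustration. It counts how often the lazy val initializer runs versus how often the function body runs.

```scala
object LazyFunctionDemo {
  var initRuns = 0 // how many times the lazy val initializer executed
  var bodyRuns = 0 // how many times the function body executed

  // Only the creation of the function object is deferred and cached.
  // The body of the function still runs on every call.
  lazy val square: Int => Int = {
    initRuns += 1
    (x: Int) => { bodyRuns += 1; x * x }
  }

  // The equivalent method: same work per call, no lazy-initialization check.
  def squareMethod(x: Int): Int = x * x

  def main(args: Array[String]): Unit = {
    square(3); square(3); square(3)
    println(s"initializer ran $initRuns time(s), body ran $bodyRuns time(s)")
  }
}
```

The initializer runs once, but the body runs once per call, so nothing about the results is cached.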
If you want memoization, see "Is there a generic way to memoize in Scala?" instead.
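For a rough idea of what memoization looks like, here is a small sketch. This is not the approach from the linked question, just one simple possibility. The memoize helper and the names MemoizeDemo, slowSquare and evaluations are hypothetical. It is not thread-safe and the cache grows without bound.

```scala
import scala.collection.mutable

object MemoizeDemo {
  // Hypothetical helper: wraps a one-argument function with a per-argument cache.
  // Not thread-safe and never evicts entries; a sketch only.
  def memoize[A, B](f: A => B): A => B = {
    val cache = mutable.Map.empty[A, B]
    (a: A) => cache.getOrElseUpdate(a, f(a))
  }

  var evaluations = 0 // counts how often the wrapped body actually runs

  val slowSquare: Int => Int = memoize { (x: Int) =>
    evaluations += 1
    x * x
  }

  def main(args: Array[String]): Unit = {
    slowSquare(4); slowSquare(4); slowSquare(5)
    // The body runs once per distinct argument, not once per call.
    println(s"evaluations = $evaluations")
  }
}
```

Here the caching is keyed by argument, which is what a lazy val holding a function does not give you.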
Ignoring your specific example, if the def in question didn't take any arguments and both it and the lazy val were simple values that were expensive to compute, I would go with the lazy val if you're going to access it many times, to avoid computing it over and over again.
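As a sketch of that difference, assuming a stand-in expensive() computation and made-up names (DefVsLazyVal, viaDef, viaLazyVal), the following counts how often each variant is evaluated:

```scala
object DefVsLazyVal {
  var defRuns = 0
  var lazyRuns = 0

  // Stand-in for an expensive computation (hypothetical).
  private def expensive(): Long = (1L to 1000L).sum

  // A parameterless def is re-evaluated on every access.
  def viaDef: Long = { defRuns += 1; expensive() }

  // A lazy val is evaluated on first access and cached afterwards.
  lazy val viaLazyVal: Long = { lazyRuns += 1; expensive() }

  def main(args: Array[String]): Unit = {
    // Access both five times.
    (1 to 5).foreach { _ =>
      val a = viaDef
      val b = viaLazyVal
      assert(a == b)
    }
    println(s"def ran $defRuns times, lazy val ran $lazyRuns time(s)")
  }
}
```

After five accesses the def has run five times and the lazy val once, which is the whole case for lazy val when the value is expensive and read often.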
If they were values that were very cheap to compute and you're not going to access them many times, or if they're expensive to compute but you're only going to access them once, I would go with a def instead. There wouldn't be much difference if you used a lazy val, but the def avoids creating the couple of extra fields that back a lazy val (the cached value and its initialization flag).
If they're somewhat cheap to compute but are accessed many times, it may be better to use a lazy val simply because the value will be cached. However, you might want to look at your overall design before looking at such micro-optimizations.