How to pass a parameter to a User-Defined Function?
I have a user-defined function:
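import math
from pyspark.sql.functions import udf, col
from pyspark.sql.types import FloatType

param1 = "A"

def calculate(type, pos):
    # param1 is currently read from the global scope
    if param1 == "A":
        a, b = [0.05, -0.06]
    else:
        a, b = [0.15, -0.16]
    return a * math.pow(type, b) * max(pos, 1)

calc = udf(calculate, FloatType())
result = df.withColumn('col1', calc(col('type'), col('pos'))).groupBy('pk').sum('events')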
I need to pass the parameter param1 to this udf. How can I do it?
You can use lit or typedLit to pass the value as an argument to your udf, like this:
In Python:
from pyspark.sql.functions import udf, col, lit
mult = udf(lambda value, multiplier: value * multiplier)
df = spark.sparkContext.parallelize([(1,),(2,),(3,)]).toDF()
df.select(mult(col("_1"), lit(3)))
In Scala:
import org.apache.spark.sql.functions.{udf, col, lit}
val mult = udf((value: Double, multiplier: Double) => value * multiplier)
val df = sparkContext.parallelize((1 to 10)).toDF
df.select(mult(col("value"), lit(3)))
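Applied to the function from the question, the lit approach might look like the sketch below. It assumes the DataFrame has numeric columns named type and pos, as in the question. The helper name add_col1 and the argument name type_ (renamed to avoid shadowing the built-in type) are hypothetical and used only for illustration.

```python
import math


def calculate(type_, pos, param1):
    """Plain Python function; param1 selects the coefficient pair."""
    if param1 == "A":
        a, b = 0.05, -0.06
    else:
        a, b = 0.15, -0.16
    return a * math.pow(type_, b) * max(pos, 1)


def add_col1(df, param1):
    """Hypothetical helper: add col1 to df, passing param1 as a literal column."""
    # Imported inside the helper so calculate() can be checked without Spark.
    from pyspark.sql.functions import udf, col, lit
    from pyspark.sql.types import FloatType

    calc = udf(calculate, FloatType())
    # lit(param1) turns the Python value into a constant column,
    # so the UDF receives it as its third argument on every row.
    return df.withColumn("col1", calc(col("type"), col("pos"), lit(param1)))
```

Calling add_col1(df, "A").groupBy('pk').sum('events') would then reproduce the pipeline from the question with the parameter passed explicitly.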
If you insist, you can also register the function as a SQL UDF:
def calculate(param1):
    return param1 * param1

sqlContext.udf.register("square", calculate)
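Once registered, the function can be called from SQL, with the parameter written directly in the query. The following is a minimal sketch, assuming an existing SparkSession is passed in. The helper name register_and_query and the LongType return type are assumptions for illustration; when no return type is given, register treats the result as a string.

```python
def square(x):
    """Plain Python function to expose to Spark SQL."""
    return x * x


def register_and_query(spark):
    """Hypothetical helper: register square() and call it with a literal argument."""
    # Imported inside the helper so square() can be checked without Spark.
    from pyspark.sql.types import LongType

    # Declare the return type explicitly; the default would be a string.
    spark.udf.register("square", square, LongType())
    # The literal 3 in the SQL text is the parameter passed to the UDF.
    return spark.sql("SELECT square(3) AS squared")
```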
I am not sure whether this will work, but can you try this?
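import math
from pyspark.sql.functions import udf, col, lit
from pyspark.sql.types import FloatType

def calculate(type, pos, param1):
    if param1 == "A":
        a, b = [0.05, -0.06]
    else:
        a, b = [0.15, -0.16]
    return a * math.pow(type, b) * max(pos, 1)

calc = udf(calculate, FloatType())
param1 = "A"
# If not "A", pass some dummy value other than "A"
# Wrap param1 in lit(): a bare string would be treated as a column name
result = df.withColumn('col1', calc(col('type'), col('pos'), lit(param1))).groupBy('pk').sum('events')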
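Another common way to get a fixed parameter into a UDF is to bind it in a closure before wrapping the function, so the UDF itself still takes only column arguments. This is a sketch of that alternative, not something taken from the answers above. The names make_calculate and make_calc_udf are hypothetical.

```python
import math


def make_calculate(param1):
    """Return a two-argument function with param1 already bound."""
    if param1 == "A":
        a, b = 0.05, -0.06
    else:
        a, b = 0.15, -0.16

    def calculate(type_, pos):
        # a and b are captured from the enclosing call to make_calculate.
        return a * math.pow(type_, b) * max(pos, 1)

    return calculate


def make_calc_udf(param1):
    """Hypothetical helper: wrap the bound function as a Spark UDF."""
    # Imported inside the helper so make_calculate() can be checked without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import FloatType

    return udf(make_calculate(param1), FloatType())
```

With this, calc = make_calc_udf("A") can be used exactly as in the question: df.withColumn('col1', calc(col('type'), col('pos'))). The trade-off is that the parameter is fixed when the UDF is created, rather than supplied per call as with lit.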