Using Spark UDF to pick integer / decimal part of signed float value from Spark dataframe
My objective is to transform a Spark DF with the below schema:
--- value (float)
into a DF having two columns that store, respectively, the integer part and the decimal part of the floating value. This is my approach:
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    def split_numbers(num):
        num = str(num)
        return [int(i) for i in num.split(".")]

    def transform(df):
        split_udf1 = udf(lambda x: split_numbers(x)[0], IntegerType())
        split_udf2 = udf(lambda x: split_numbers(x)[1], IntegerType())
        return df.select(split_udf1(df['value']).alias('value1'),
                         split_udf2(df['value']).alias('value2'))
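As a side note, since the title mentions signed floats: a minimal pure-Python sketch of the split logic (using the `split_numbers` helper as written above) shows that the sign survives on the integer part but is dropped from the decimal part, which may or may not be the intended behaviour:

```python
def split_numbers(num):
    # Split "int.frac" on the decimal point and convert each piece to int.
    num = str(num)
    return [int(i) for i in num.split(".")]

# Positive value: integer part 3, decimal digits 75.
print(split_numbers(3.75))   # [3, 75]

# Negative value: the minus sign attaches only to the first piece,
# so the decimal part comes back positive.
print(split_numbers(-3.75))  # [-3, 75]
```

Note also that for values between -1 and 0 (e.g. -0.5, whose string form is "-0.5") the integer piece parses to 0 and the sign is lost entirely.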
But I don't get any values in my transformed DF. What are the possible reasons?
After debugging I found out what is happening. The code is working correctly. However, after returning the resultant DF, I was creating a view out of it to query later. But by the time I started to query the view, I was outside of my Spark context.