My objective is to transform a Spark DataFrame with the schema below:

--- value (float)

into a DataFrame with two columns that store, respectively, the integer part and the decimal part of the float value. This is my approach:
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    def split_numbers(num):
        # "12.34" -> [12, 34]: split the string form on the decimal point
        return [int(i) for i in str(num).split(".")]

    def transform(df):
        split_udf1 = udf(lambda x: split_numbers(x)[0], IntegerType())
        split_udf2 = udf(lambda x: split_numbers(x)[1], IntegerType())
        return df.select(split_udf1(df['value']).alias('value1'),
                         split_udf2(df['value']).alias('value2'))
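As a sanity check, the splitting helper can be exercised without Spark at all; this is a plain-Python sketch with sample values of my own choosing:

```python
def split_numbers(num):
    # "12.34" -> [12, 34]: split the string form on the decimal point
    return [int(i) for i in str(num).split(".")]

print(split_numbers(12.34))  # [12, 34]
print(split_numbers(7.0))    # [7, 0]
```

Note that this relies on `str()` producing a plain decimal representation; very large or very small floats that render in scientific notation (e.g. `1e-07`) would break the split.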
But I don't get any values in my transformed DataFrame. What are the possible reasons?
After debugging I found out what was happening. The code itself works correctly. However, after returning the resultant DataFrame I was creating a temporary view from it to query later, and the step where I queried that view ran outside the Spark context (session) in which the view was registered, so the query found nothing.