
Using a Spark UDF to extract the integer / decimal parts of a signed float value from a Spark DataFrame

My objective is to transform a Spark DF with the schema below

--- value (float)

into a DF with two columns, storing the integer part and the decimal part of the floating-point value, respectively. This is my approach:

from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

def split_numbers(num):
    # "12.34" -> ["12", "34"] -> [12, 34]
    num = str(num)
    return [int(i) for i in num.split(".")]

def transform(df):
    split_udf1 = udf(lambda x: split_numbers(x)[0], IntegerType())
    split_udf2 = udf(lambda x: split_numbers(x)[1], IntegerType())
    return df.select(split_udf1(df['value']).alias('value1'),
                     split_udf2(df['value']).alias('value2'))
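To see what the splitting logic does on its own (outside Spark), here is a minimal pure-Python sketch. Note two caveats for signed and fractional values: the sign survives only in the integer part, and any leading zeros in the decimal digits are dropped by `int()`:

```python
def split_numbers(num):
    # Convert to a string and split on the decimal point:
    # "12.34" -> ["12", "34"] -> [12, 34]
    return [int(i) for i in str(num).split(".")]

print(split_numbers(12.34))   # [12, 34]
print(split_numbers(-3.25))   # [-3, 25]  (sign kept only on the integer part)
print(split_numbers(1.05))    # [1, 5]    (leading zero in ".05" is lost)
```

These caveats may or may not matter for your use case, but they mean the "decimal part" column is really the decimal digits as an integer, not a fractional value.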

But I don't get any values in my transformed DF. What are the possible reasons?

After debugging I found out what was happening. The code itself works correctly. However, after returning the resultant DF, I was creating a view from it to query later.

By the time I queried that view, I was outside the Spark context in which it had been registered, so the query returned nothing.

