
How to fill null values with a current_timestamp() in PySpark DataFrame?

I have a column called createdtime that contains a few nulls. All I want is to fill those nulls with the current timestamp.

I have tried the code below, where I assign the time manually. I want it to work in such a way that whenever I run this code, it picks up current_timestamp().

from pyspark.sql.functions import *
default_time = '2022-06-28 05:07:29.077'
df = df.fillna({'createdtime': default_time})

I have also tried the method below, but it gives an error: TypeError: Column is not iterable.

from pyspark.sql.functions import *
default_time = current_timestamp()
df = df.fillna({'createdtime': default_time})


The value of default_time needs to be passed as a quoted string.

default_time = '2022-06-28 05:07:29.077'
df = df.fillna({'createdtime': f'{default_time}'})

Or use the coalesce function.

from pyspark.sql import functions as F
df = df.withColumn('createdtime', F.coalesce('createdtime', F.current_timestamp()))

Because fillna accepts a string but not a Column, you can use the code below:

import datetime
df = df.fillna({"dt_service": str(datetime.datetime.utcnow())})
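
A variation on the snippet above, offered as a sketch rather than a definitive fix: datetime.utcnow() is deprecated as of Python 3.12, so this version uses a timezone-aware datetime.now(timezone.utc) and formats it with millisecond precision like the '2022-06-28 05:07:29.077' value in the question. The helper name utc_timestamp_string is hypothetical. Note also that fillna with a string value only fills string-typed columns and ignores columns of other types, so this assumes createdtime is stored as a string.

```python
from datetime import datetime, timezone


def utc_timestamp_string(now=None):
    """Format a UTC time like '2022-06-28 05:07:29.077' (hypothetical helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # %f yields six microsecond digits; drop the last three to keep milliseconds.
    return now.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


default_time = utc_timestamp_string()
print(default_time)

# Assumed usage with a DataFrame `df` whose `createdtime` column is a string:
# df = df.fillna({"createdtime": default_time})
```

The string is computed once on the driver, so every null filled in a given run receives the same value.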

You can't pass current_timestamp() because it returns a Column expression, and fillna accepts only literal values: int, float, bool, or string (or a dict of such values).

You can use Python's datetime module to pass the current timestamp instead.

Below is the working code:

>>> df.show()
+---------+------+-----+----------+
|school_id|gender|class|       doj|
+---------+------+-----+----------+
|        1|     M|    9|01/01/2020|
|        1|     M|   10|01/03/2018|
|        1|     F|   10|01/04/2018|
|        2|     M|    9|      null|
|        2|     F|   10|      null|
+---------+------+-----+----------+

>>> from datetime import datetime
>>> now = datetime.now()
>>> dt_string = now.strftime("%d-%m-%Y %H:%M:%S")
>>> df.fillna(value=dt_string,subset=['doj']).show()
+---------+------+-----+-------------------+
|school_id|gender|class|                doj|
+---------+------+-----+-------------------+
|        1|     M|    9|         01/01/2020|
|        1|     M|   10|         01/03/2018|
|        1|     F|   10|         01/04/2018|
|        2|     M|    9|28-06-2022 13:22:10|
|        2|     F|   10|28-06-2022 13:22:10|
+---------+------+-----+-------------------+

