I am looking to replace all the values of a column in a Spark DataFrame with a particular value. I am using PySpark. I tried something like:
new_df = df.withColumn('column_name',10)
Here I want to replace all the values in the column column_name with 10. In pandas this could be done with df['column_name'] = 10. I am unable to figure out how to do the same in Spark.
You can use a UDF to replace the value, and currying lets you reuse the same helper with different replacement values.
from pyspark.sql.functions import udf, col
from pyspark.sql.types import IntegerType

def replacerUDF(value):
    # ignore the incoming column value and always return the constant;
    # an explicit return type is needed, otherwise the UDF defaults to StringType
    return udf(lambda x: value, IntegerType())

new_df = df.withColumn("column_name", replacerUDF(10)(col("column_name")))
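For example, the curried helper can be reused with different constants; here other_column is a hypothetical second column assumed to exist on df:

# reuse the factory with a different replacement value per column
df_a = df.withColumn("column_name", replacerUDF(10)(col("column_name")))
df_b = df.withColumn("other_column", replacerUDF(0)(col("other_column")))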
It might be easier to use lit, as follows:
from pyspark.sql.functions import lit
new_df = df.withColumn('column_name', lit(10))
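A minimal end-to-end sketch, assuming an active SparkSession named spark; the example DataFrame and its column names are illustrative:

from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, 'a'), (2, 'b')], ['column_name', 'other_column'])

# lit(10) builds a constant Column, so every row gets the value 10
new_df = df.withColumn('column_name', lit(10))
new_df.show()
# +-----------+------------+
# |column_name|other_column|
# +-----------+------------+
# |         10|           a|
# |         10|           b|
# +-----------+------------+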