
PySpark - update a column on a condition with a value from its current row

I am trying to update a column based on a condition. If the condition holds, the column should be set to a fixed string followed by the value of another column in the same row.

updated_df = original_df.withColumn(
    "url",
    F.when(
        original_df.id == 13,
        "something/{}".format(?),  # <- I want the current row's 'name' column value here
    ).otherwise(original_df.url),
)

Is this the right approach?

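You can use the format_string function from pyspark.sql.functions. Python's str.format runs once on the driver, before Spark processes any rows, so it cannot read a per-row value. format_string instead builds a column expression that is evaluated for each row: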


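from pyspark.sql import functions as F

# col_name is a placeholder for the column to interpolate (e.g. "name").
updated_df = original_df.withColumn(
    "url",
    F.when(
        original_df.id == 13,
        F.format_string("something/%s", original_df.col_name),
    ).otherwise(original_df.url),  # keep the existing url for all other rows
)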

