How to use round(col(), col()) in PySpark?

Question

I want to use the ROUND function like this:

CAST(ROUND(XTD2.CpTechnologyCostAmt,ISNULL(CurrencyDecimalPlaceNum)) AS decimal(32,8))

in PySpark.

In both the DataFrame API and SQL, ROUND takes a column as its first argument and an integer as its second, but I want to pass another column as the second argument.

When I pass a column as the second argument, it fails with the error "Column is not callable".

PySpark code:

round(
    col("XTD1.CpDirectCostAmt"),
    coalesce(col("CurrencyDecimalPlaceNum").cast(IntegerType()), lit(2)),
).cast(DecimalType(23, 6))

How can I solve this issue?

Answer 1

The round() function takes a column and an int as arguments (see the doc). The problem is that you are passing two columns as arguments, because coalesce() returns a column.

I'm not sure how to do it with coalesce. I would use a UDF: write a function that rounds the number, then apply it to both columns like this:

from pyspark.sql import SparkSession
import pyspark.sql.functions as F
from pyspark.sql.types import DoubleType


def round_value(value, scale):
    # Propagate nulls instead of failing inside round()
    if value is None:
        return None
    # Fall back to 2 decimal places when the scale column is null
    if scale is None:
        scale = 2
    return round(value, scale)


if __name__ == "__main__":
    spark = SparkSession.builder.master("local").appName("Test").getOrCreate()
    df = spark.createDataFrame(
        [
            (1, 1, 2.3445),
            (2, None, 168.454523),
            (3, 4, 3500.345354),
        ],
        ["id", "CurrencyDecimalPlaceNum", "float_col"],
    )
    # Declare the return type; without it the UDF returns a string column
    round_udf = F.udf(round_value, DoubleType())
    df = df.withColumn(
        "round",
        round_udf(
            F.col("float_col"),
            F.col("CurrencyDecimalPlaceNum"),
        ),
    )
    df.show()

Result:

+---+-----------------------+-----------+---------+
| id|CurrencyDecimalPlaceNum|  float_col|    round|
+---+-----------------------+-----------+---------+
|  1|                      1|     2.3445|      2.3|
|  2|                   null| 168.454523|   168.45|
|  3|                      4|3500.345354|3500.3454|
+---+-----------------------+-----------+---------+
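
One caveat with the UDF approach: Python's built-in round() rounds ties to the nearest even digit, while Spark's round() uses HALF_UP, so the two can disagree on values that sit exactly on a tie. If the result needs to match SQL ROUND semantics, the body of the UDF could be swapped for something like the sketch below. The name round_half_up and its default_scale parameter are made up for this illustration, and converting through str(value) is an assumption about how the float should be interpreted; NaN and infinite inputs are not handled.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def round_half_up(
    value: Optional[float],
    scale: Optional[int],
    default_scale: int = 2,  # hypothetical fallback, mirrors lit(2) in the question
) -> Optional[float]:
    """Round value to `scale` decimal places, with ties going away from zero."""
    # Propagate nulls the way a SQL function would
    if value is None:
        return None
    if scale is None:
        scale = default_scale
    # Decimal(1).scaleb(-2) is Decimal("0.01"): the step size to round to
    quantum = Decimal(1).scaleb(-scale)
    # str(value) gives the short decimal form of the float, e.g. "0.125"
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    print(round(2.5), round_half_up(2.5, 0))         # built-in vs. half-up on a tie
    print(round(0.125, 2), round_half_up(0.125, 2))  # same comparison at scale 2
```

To use it in the answer above, pass round_half_up to F.udf in place of round_value; the rest of the DataFrame code stays the same.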
