I want to use the ROUND function in PySpark like this:
    CAST(ROUND(XTD2.CpTechnologyCostAmt,ISNULL(CurrencyDecimalPlaceNum)) AS decimal(32,8))
In both the DataFrame API and SQL, ROUND takes a column as its first argument and an int as its second, but I want to pass another column as the second argument.
When I pass a column as the second argument, I get the error "Column is not callable".
PySpark code:
    round(
        col("XTD1.CpDirectCostAmt"),
        coalesce(col("CurrencyDecimalPlaceNum").cast(IntegerType()), lit(2)),
    ).cast(DecimalType(23, 6))
How can I solve this issue?
The round() function takes a column and an int as arguments (see its documentation). The problem is that you are passing two columns as arguments, since coalesce returns a column.
I'm not sure how to do this with coalesce. I would use a UDF instead: write a function that rounds the number, then apply it to both columns like this:
    from pyspark.sql import SparkSession
    import pyspark.sql.functions as F
    from pyspark.sql.types import DoubleType


    def round_value(value, scale):
        # Fall back to 2 decimal places when the scale column is null
        if scale is None:
            scale = 2
        return round(value, scale)


    if __name__ == "__main__":
        spark = SparkSession.builder.master("local").appName("Test").getOrCreate()
        df = spark.createDataFrame(
            [
                (1, 1, 2.3445),
                (2, None, 168.454523),
                (3, 4, 3500.345354),
            ],
            ["id", "CurrencyDecimalPlaceNum", "float_col"],
        )
        # Declare the return type; without it the UDF column is a string
        round_udf = F.udf(round_value, DoubleType())
        df = df.withColumn(
            "round",
            round_udf(
                F.col("float_col"),
                F.col("CurrencyDecimalPlaceNum"),
            ),
        )
        df.show()
Result:
    +---+-----------------------+-----------+---------+
    | id|CurrencyDecimalPlaceNum|  float_col|    round|
    +---+-----------------------+-----------+---------+
    |  1|                      1|     2.3445|      2.3|
    |  2|                   null| 168.454523|   168.45|
    |  3|                      4|3500.345354|3500.3454|
    +---+-----------------------+-----------+---------+
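One caveat with the UDF above: Python's built-in round() works on binary floats and rounds ties to even, while Spark's own round rounds half up, so the two can disagree on values such as 0.125 or 2.675. For currency amounts this may matter. The sketch below is one possible replacement for the UDF body, using the standard decimal module. The names round_half_up and DEFAULT_SCALE are hypothetical, and the fallback of 2 is simply carried over from the example above.

```python
from decimal import Decimal, ROUND_HALF_UP

DEFAULT_SCALE = 2  # assumption: same fallback as in the UDF above


def round_half_up(value, scale):
    """Round value to `scale` decimal places, with ties going away from zero."""
    if value is None:
        return None  # keep nulls as nulls
    if scale is None:
        scale = DEFAULT_SCALE
    # Go through str() so 2.675 is read as the decimal 2.675,
    # not as the binary float closest to it.
    quantum = Decimal(1).scaleb(-scale)  # e.g. scale=2 gives Decimal("0.01")
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(round(0.125, 2), round_half_up(0.125, 2))
    print(round(2.675, 2), round_half_up(2.675, 2))
```

Since this function returns a Decimal, it would presumably be registered with a decimal return type, for example F.udf(round_half_up, DecimalType(32, 8)), rather than DoubleType; that wiring is untested here.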