How to truncate values for multiple rows and columns in a DataFrame in Spark Scala
I have a DataFrame df:
id A B C D
1 1.000234 2.3456 4.6789 7.6934
2 3.7643 4.2323 5.6342 8.567
I want to create another DataFrame df1 with the values truncated to 2 decimal places:
id A B C D
1 1.00 2.35 4.68 7.69
2 3.76 4.23 5.63 8.57
Can someone help me write the code? My DataFrame has 70 columns and 10000 rows.
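The expected output in the question is rounded rather than strictly truncated (2.3456 becomes 2.35, not 2.34). A minimal plain-Scala sketch of the difference, using the sample values from the question (the names `values`, `rounded` and `truncated` are only illustrative):

```scala
import scala.math.BigDecimal.RoundingMode

// Three of the sample values from the question, as exact decimals.
val values = Seq(BigDecimal("2.3456"), BigDecimal("4.6789"), BigDecimal("7.6934"))

// Rounding to 2 places, which is what the expected output shows.
val rounded = values.map(_.setScale(2, RoundingMode.HALF_UP))

// Truncating to 2 places simply drops the extra digits.
val truncated = values.map(_.setScale(2, RoundingMode.DOWN))

println(rounded.mkString(", "))   // 2.35, 4.68, 7.69
println(truncated.mkString(", ")) // 2.34, 4.67, 7.69
```

The answers below all round, so they match the expected output rather than the word "truncate" in the title.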
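This is easy to do with the format_number function, which formats each value to a fixed number of decimal places and returns it as a string: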
val df = Seq(
  (1, 1.000234, 2.3456, 4.6789, 7.6934),
  (2, 3.7643, 4.2323, 5.6342, 8.567)
).toDF("id", "A", "B", "C", "D")
df.show()
+---+--------+------+------+------+
| id|       A|     B|     C|     D|
+---+--------+------+------+------+
|  1|1.000234|2.3456|4.6789|7.6934|
|  2|  3.7643|4.2323|5.6342| 8.567|
+---+--------+------+------+------+
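import org.apache.spark.sql.functions.{col, format_number}

// Format each value column to 2 decimal places; "id" is passed through unchanged.
val df1 = df.select(col("id"),
  format_number(col("A"), 2).as("A"),
  format_number(col("B"), 2).as("B"),
  format_number(col("C"), 2).as("C"),
  format_number(col("D"), 2).as("D"))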
df1.show()
+---+----+----+----+----+
| id|   A|   B|   C|   D|
+---+----+----+----+----+
|  1|1.00|2.35|4.68|7.69|
|  2|3.76|4.23|5.63|8.57|
+---+----+----+----+----+
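With 70 columns, listing each format_number call by hand is impractical. A minimal sketch of building the same select list programmatically, assuming every column except "id" is numeric; the session setup and the names `formatted` and `df1` are illustrative, and in spark-shell the session lines can be skipped because `spark` already exists:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, format_number}

// Local session for a standalone run (assumption: not needed inside spark-shell).
val spark = SparkSession.builder().master("local[*]").appName("format-number-sketch").getOrCreate()
import spark.implicits._

val df = Seq(
  (1, 1.000234, 2.3456, 4.6789, 7.6934),
  (2, 3.7643, 4.2323, 5.6342, 8.567)
).toDF("id", "A", "B", "C", "D")

// Keep "id" as it is; format every other column to 2 decimal places.
val formatted = df.columns.map {
  case "id" => col("id")
  case name => format_number(col(name), 2).as(name)
}

val df1 = df.select(formatted: _*)
df1.printSchema() // columns A to D are now strings, not doubles
```

Because format_number produces strings and inserts thousands separators for large values (for example 1,234.57), it suits display better than further arithmetic.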
Here is one way to handle the values in the DataFrame dynamically, rather than hard-coding each column:
import org.apache.spark.sql.functions.{col, round}

// Round every column to 2 decimal places; values stay numeric.
val df1 = df.columns.foldLeft(df) { (acc, colName) =>
  acc.withColumn(colName, round(col(colName), 2))
}
This worked for me.
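A possible variant of the same idea, sketched under the assumption that only columns of type double should be rounded: it inspects the schema and builds a single select instead of chaining one withColumn call per column, which keeps the query plan smaller for wide DataFrames. The helper name `roundDoubles` is hypothetical, not part of Spark.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, round}
import org.apache.spark.sql.types.DoubleType

// Local session for a standalone run (assumption: not needed inside spark-shell).
val spark = SparkSession.builder().master("local[*]").appName("round-doubles-sketch").getOrCreate()
import spark.implicits._

// Hypothetical helper: round every DoubleType column to `scale` places,
// leaving all other columns (such as the integer "id") untouched.
def roundDoubles(df: DataFrame, scale: Int): DataFrame = {
  val cols = df.schema.fields.map { f =>
    if (f.dataType == DoubleType) round(col(f.name), scale).as(f.name)
    else col(f.name)
  }
  df.select(cols: _*)
}

val df = Seq(
  (1, 1.000234, 2.3456, 4.6789, 7.6934),
  (2, 3.7643, 4.2323, 5.6342, 8.567)
).toDF("id", "A", "B", "C", "D")

val rounded = roundDoubles(df, 2)
rounded.show()
```

Note that round uses HALF_UP rounding; Spark also provides bround for HALF_EVEN rounding.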
You can import org.apache.spark.sql.types._ and cast the columns to DecimalType(3,2), that is, a decimal with precision 3 and scale 2:
scala> val df = Seq(
| (1, 1.000234, 2.3456, 4.6789, 7.6934),
| (2, 3.7643, 4.2323, 5.6342, 8.567)
| ).toDF("id", "A", "B", "C", "D")
df: org.apache.spark.sql.DataFrame = [id: int, A: double ... 3 more fields]
scala> df.show()
+---+--------+------+------+------+
| id|       A|     B|     C|     D|
+---+--------+------+------+------+
|  1|1.000234|2.3456|4.6789|7.6934|
|  2|  3.7643|4.2323|5.6342| 8.567|
+---+--------+------+------+------+
scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._
scala> val df2=df.columns.filter(_ !="id").foldLeft(df){ (acc,x) => acc.withColumn(x,col(x).cast(DecimalType(3,2))) }
df2: org.apache.spark.sql.DataFrame = [id: int, A: decimal(3,2) ... 3 more fields]
scala> df2.show(false)
+---+----+----+----+----+
|id |A   |B   |C   |D   |
+---+----+----+----+----+
|1  |1.00|2.35|4.68|7.69|
|2  |3.76|4.23|5.63|8.57|
+---+----+----+----+----+
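One caveat with DecimalType(3,2): a precision of 3 with a scale of 2 leaves only one digit before the decimal point, so it can hold values up to 9.99. Larger values do not fit and, depending on the Spark version and ANSI settings, the cast may yield null or fail. A minimal sketch with a wider precision, assuming 10 total digits are enough for the data; the sample value 12345.6789 and the name `wide` are illustrative only:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DecimalType, DoubleType}

// Local session for a standalone run (assumption: not needed inside spark-shell).
val spark = SparkSession.builder().master("local[*]").appName("decimal-cast-sketch").getOrCreate()
import spark.implicits._

// Column B holds a value that would not fit into DecimalType(3, 2).
val df = Seq((1, 1.000234, 12345.6789)).toDF("id", "A", "B")

// Cast every double column to a decimal with 10 total digits, 2 of them after the point.
val casted = df.schema.fields.map { f =>
  if (f.dataType == DoubleType) col(f.name).cast(DecimalType(10, 2)).as(f.name)
  else col(f.name)
}

val wide = df.select(casted: _*)
wide.show(false)
```

The cast rounds to the target scale, so 12345.6789 is stored as 12345.68.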