
Converting a column from decimal to binary string in Spark / Scala

I have a Spark dataframe with a decimal column. I want to convert this column to a binary string. Is there a function for this? Can anybody help?

Thank you!

There is a built-in bin function, whose documentation states:

An expression that returns the string representation of the binary value of the given long column. For example, bin("12") returns "1100".
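Outside of Spark, the same conversion can be sketched in plain Scala: for a long value, bin is equivalent to Java's toBinaryString (the value 12 here is just the example from the docs above).

```scala
// Plain-Scala equivalent of what bin() computes for a long value
val decimal: Long = 12L
val binary: String = java.lang.Long.toBinaryString(decimal)
// binary is "1100", matching bin("12") in the documentation
```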

So if you have a dataframe such as

+-----+
|Value|
+-----+
|4    |
+-----+

root
 |-- Value: decimal(10,0) (nullable = true)

you can use the bin function as

import org.apache.spark.sql.functions._
data.withColumn("Value_Binary", bin(col("Value")))

which should give you

+-----+------------+
|Value|Value_Binary|
+-----+------------+
|4    |100         |
+-----+------------+

root
 |-- Value: decimal(10,0) (nullable = true)
 |-- Value_Binary: string (nullable = true)

I solved this issue by creating a user-defined function.

// The UDF expects an Int; Spark casts the decimal column to match
val toBinStr: Int => String = _.toBinaryString

import org.apache.spark.sql.functions.udf
val toBinStrUDF = udf(toBinStr)

// Apply the UDF to the source dataset ($ requires spark.implicits._)
import spark.implicits._
data.withColumn("Value_Binary", toBinStrUDF($"Value")).show
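If the column's precision exceeds what fits in an Int, the conversion logic itself can be sketched against java.math.BigDecimal (the type a Spark decimal deserializes to); the helper name toBinFromDecimal here is illustrative, not part of any API:

```scala
// Hypothetical helper: convert a java.math.BigDecimal of scale 0
// to its binary representation via BigInteger's radix-2 toString
val toBinFromDecimal: java.math.BigDecimal => String =
  d => d.toBigInteger.toString(2)

val result = toBinFromDecimal(new java.math.BigDecimal(4))
// result is "100"
```

This function could then be wrapped in a udf the same way as above.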

Disclaimer: the technical posts on this site are licensed under CC BY-SA 4.0; if you repost, please credit this site or the original source. For any questions, contact yoyou2525@163.com.

© 2020-2024 STACKOOM.COM