I am new to Scala. I am trying to write unit-test assertions for UT/DQ checks on a Scala Spark DataFrame using the ZIO library. Can anyone who has worked with ZIO before help me out?
I would recommend spark-fast-tests for making assertions about Spark DataFrames in Scala. ZIO Test isn't one of the frameworks that spark-fast-tests documents support for, but you should still be able to use it.
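For reference, here is a rough sketch of the test dependencies in build.sbt. It assumes Spark itself is already on the classpath of your build; the version numbers are only illustrative and should be checked against the current releases.

```scala
// build.sbt (fragment): test-only dependencies.
// Versions below are assumptions for illustration; pick the ones matching your Spark/Scala setup.
libraryDependencies ++= Seq(
  "com.github.mrpowers" %% "spark-fast-tests" % "1.3.0"  % Test,
  "dev.zio"             %% "zio-test"         % "2.0.13" % Test,
  "dev.zio"             %% "zio-test-sbt"     % "2.0.13" % Test
)

// Lets `sbt test` discover ZIO Test specs.
testFrameworks += new TestFramework("zio.test.sbt.ZTestFramework")
```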
If you have some transformation on a DataFrame that you need to test:
import org.apache.spark.sql.functions.lit
import org.apache.spark.sql.DataFrame

object Transformations {
  // Adds a constant column "foo" whose value is "bar" on every row.
  def appendLiteral(incomingData: DataFrame): DataFrame =
    incomingData.withColumn("foo", lit("bar"))
}
A naive test, which doesn't leverage the wider ZIO effect ecosystem, might look like this:
import com.github.mrpowers.spark.fast.tests.DataFrameComparer
import org.apache.spark.sql.{DataFrame, SparkSession}
import zio.test._
import zio.test.Assertion._

object TransformationsSpec extends ZIOSpecDefault with DataFrameComparer {
  // A local SparkSession shared by every test in this spec.
  val spark: SparkSession = SparkSession.builder().config("spark.master", "local").getOrCreate()
  import spark.implicits._

  def spec = suite("TransformationsSpec")(
    test("appendLiteral adds a column named 'foo' with value 'bar'") {
      val testInput: DataFrame = Seq("Hello", "hi", "howdy").toDF("greeting")
      val expected: DataFrame = Seq(("Hello", "bar"), ("hi", "bar"), ("howdy", "bar")).toDF("greeting", "foo")
      val result = testInput.transform(Transformations.appendLiteral)
      // assertSmallDataFrameEquality takes the actual DataFrame first, returns Unit on success and throws on a mismatch.
      // ignoreNullable = true because lit("bar") yields a non-nullable column, unlike the one built with toDF.
      assert(assertSmallDataFrameEquality(result, expected, ignoreNullable = true))(isUnit)
    }
  )
}
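If you want to lean a bit more on ZIO, one option is to manage the SparkSession as a scoped layer and run the comparison inside an effect, so that a mismatch (which spark-fast-tests signals by throwing) shows up as a failed effect. The following is only a sketch under some assumptions: it targets ZIO 2, the names TransformationsLayeredSpec and sparkLayer are made up for illustration, and the transformation is inlined so the example stands alone.

```scala
import com.github.mrpowers.spark.fast.tests.DataFrameComparer
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.lit
import zio._
import zio.test._

// Hypothetical spec name; assumes ZIO 2.x and spark-fast-tests on the test classpath.
object TransformationsLayeredSpec extends ZIOSpecDefault with DataFrameComparer {

  // Acquire a local SparkSession and stop it when the layer's scope closes.
  val sparkLayer: ZLayer[Any, Throwable, SparkSession] =
    ZLayer.scoped(
      ZIO.acquireRelease(
        ZIO.attempt(
          SparkSession.builder().master("local[1]").appName("zio-spec").getOrCreate()
        )
      )(spark => ZIO.succeed(spark.stop()))
    )

  def spec = suite("TransformationsLayeredSpec")(
    test("appending a literal column 'foo' with value 'bar'") {
      ZIO.serviceWithZIO[SparkSession] { spark =>
        // ZIO.attempt turns an exception from the comparison into a failed effect.
        ZIO.attempt {
          import spark.implicits._
          val input    = Seq("Hello", "hi", "howdy").toDF("greeting")
          val expected = Seq(("Hello", "bar"), ("hi", "bar"), ("howdy", "bar")).toDF("greeting", "foo")
          val actual   = input.withColumn("foo", lit("bar")) // same logic as appendLiteral
          assertSmallDataFrameEquality(actual, expected, ignoreNullable = true)
        }
      }.as(assertCompletes)
    }
  ).provideLayerShared(sparkLayer) // one session shared across the whole suite
}
```

Sharing the layer across the suite avoids paying the SparkSession start-up cost once per test, which is usually the slowest part of this kind of spec.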