
How to write unit tests in Spark 2.0+?

I've been trying to find a reasonable way to test SparkSession with the JUnit testing framework. While there seem to be good examples for SparkContext, I couldn't figure out how to get a corresponding example working for SparkSession, even though it is used in several places internally in spark-testing-base. I'd be happy to try a solution that doesn't use spark-testing-base as well, if it isn't really the right way to go here.

Simple test case (complete MWE project with build.sbt):

import com.holdenkarau.spark.testing.DataFrameSuiteBase
import org.junit.Test
import org.scalatest.FunSuite

import org.apache.spark.sql.SparkSession


class SessionTest extends FunSuite with DataFrameSuiteBase {

  implicit val sparkImpl: SparkSession = spark

  @Test
  def simpleLookupTest {

    val homeDir = System.getProperty("user.home")
    val training = spark.read.format("libsvm")
      .load(s"$homeDir\\Documents\\GitHub\\sample_linear_regression_data.txt")
    println("completed simple lookup test")
  }

}

The result of running this with JUnit is an NPE at the load line:

java.lang.NullPointerException
    at SessionTest.simpleLookupTest(SessionTest.scala:16)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
    at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)

Note that it shouldn't matter whether the file being loaded exists; in a properly configured SparkSession, a more sensible error would be thrown.

Thank you for putting this outstanding question out there. For some reason, when it comes to Spark, everyone gets so caught up in the analytics that they forget about the great software engineering practices that emerged in the last 15 years or so. This is why we make it a point to discuss testing and continuous integration (among other things like DevOps) in our course.

A Quick Aside on Terminology

A true unit test means you have complete control over every component in the test. There can be no interaction with databases, REST calls, file systems, or even the system clock; everything has to be "doubled" (e.g. mocked, stubbed, etc.), as Gerard Meszaros puts it in xUnit Test Patterns. I know this seems like semantics, but it really matters. Failing to understand this is one major reason why you see intermittent test failures in continuous integration.
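
To make "doubling" concrete, here is a tiny, purely illustrative Scala sketch (the function and values are made up): the system clock is passed in as a dependency, so a test can substitute a fixed value for the real clock.

// Illustrative only: "doubling" the system clock by injecting it as a dependency.
object ClockExample {
  // Production code passes () => System.currentTimeMillis(); a test passes a canned value.
  def isExpired(createdAtMillis: Long, ttlMillis: Long, nowMillis: () => Long): Boolean =
    nowMillis() - createdAtMillis > ttlMillis

  def main(args: Array[String]): Unit = {
    // The "test double" is just a fixed clock, so the result is deterministic.
    assert(isExpired(createdAtMillis = 0L, ttlMillis = 100L, nowMillis = () => 200L))
  }
}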

We Can Still Unit Test

So given this understanding, unit testing an RDD is impossible. However, there is still a place for unit testing when developing analytics.

Consider a simple operation:

rdd.map(foo).map(bar)

Here foo and bar are simple functions. Those can be unit tested in the normal way, and they should be, with as many corner cases as you can muster. After all, why should they care where they are getting their inputs from, whether it is a test fixture or an RDD?
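
For instance, a sketch of what that could look like (foo and bar here are hypothetical stand-ins, not anything from the question):

import org.scalatest.FunSuite

// Hypothetical transformations; the point is that they know nothing about Spark.
object Transforms {
  def foo(n: Int): Int = n * 2
  def bar(n: Int): String = s"id-$n"
}

class TransformsSpec extends FunSuite {
  test("foo doubles its input, including zero and negatives") {
    assert(Transforms.foo(0) === 0)
    assert(Transforms.foo(-3) === -6)
  }
  test("bar formats an id") {
    assert(Transforms.bar(42) === "id-42")
  }
}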

Don't Forget the Spark Shell

This isn't testing per se, but in these early stages you also should be experimenting in the Spark shell to figure out your transformations and especially the consequences of your approach. For example, you can examine physical and logical query plans, partitioning strategy and preservation, and the state of your data with many different functions like toDebugString, explain, glom, show, printSchema, and so on. I will let you explore those.

You can also set your master to local[2] in the Spark shell and in your tests to identify any problems that may only arise once you start to distribute work.
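
A minimal sketch of that kind of exploration, assuming a local session (the data and column names are purely illustrative):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[2]")   // two local threads, enough to surface some distribution issues
  .appName("exploration")
  .getOrCreate()
import spark.implicits._

val df = spark.range(0, 1000).toDF("id")
df.printSchema()                                              // inspect the schema
df.groupBy(($"id" % 10).as("bucket")).count().explain(true)   // logical and physical plans
println(df.rdd.toDebugString)                                 // RDD lineage
println(df.rdd.glom().map(_.length).collect().mkString(","))  // rows per partition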

Integration Testing with Spark

Now for the fun stuff.

In order to integration test Spark after you feel confident in the quality of your helper functions and RDD/DataFrame transformation logic, it is critical to do a few things (regardless of build tool and test framework):

  • Increase JVM memory.
  • Enable forking but disable parallel execution (see the build sketch after this list).
  • Use your test framework to accumulate your Spark integration tests into suites, and initialize the SparkContext before all tests and stop it after all tests.
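
A sketch of how the first two points might be expressed in build.sbt (sbt 1.x slash syntax; older sbt uses the `fork in Test` form, and the heap size here is just an example):

// build.sbt -- illustrative settings, not prescriptions
Test / fork := true                   // run tests in a forked JVM with its own memory settings
Test / parallelExecution := false     // Spark suites should not run concurrently
Test / javaOptions ++= Seq("-Xmx2G")  // give the forked test JVM more heap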

With ScalaTest, you can mix in BeforeAndAfterAll (which I prefer generally) or BeforeAndAfterEach as @ShankarKoirala does to initialize and tear down Spark artifacts. I know this is a reasonable place to make an exception, but I really don't like those mutable vars you have to use, though.
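
A minimal sketch of the BeforeAndAfterAll flavor (the suite and names are made up, and yes, it has exactly the mutable var I just complained about):

import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{BeforeAndAfterAll, FunSuite}

class SparkSuiteBase extends FunSuite with BeforeAndAfterAll {

  @transient var sc: SparkContext = _

  override def beforeAll(): Unit = {
    super.beforeAll()
    sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("suite"))
  }

  override def afterAll(): Unit = {
    if (sc != null) sc.stop()
    super.afterAll()
  }

  test("word lengths add up") {
    assert(sc.parallelize(Seq("a", "bb", "ccc")).map(_.length).reduce(_ + _) === 6)
  }
}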

The Loan Pattern

Another approach is to use the Loan Pattern.

For example (using ScalaTest):

import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{Matchers, WordSpec}

class MySpec extends WordSpec with Matchers with SparkContextSetup {
  "My analytics" should {
    "calculate the right thing" in withSparkContext { (sparkContext) =>
      val data = Seq(...)
      val rdd = sparkContext.parallelize(data)
      val total = rdd.map(...).filter(...).map(...).reduce(_ + _)

      total shouldBe 1000
    }
  }
}

trait SparkContextSetup {
  def withSparkContext(testMethod: (SparkContext) => Any) {
    val conf = new SparkConf()
      .setMaster("local")
      .setAppName("Spark test")
    val sparkContext = new SparkContext(conf)
    try {
      testMethod(sparkContext)
    }
    finally sparkContext.stop()
  }
} 

As you can see, the Loan Pattern makes use of higher-order functions to "loan" the SparkContext to the test and then to dispose of it after it's done.
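
Since the question is specifically about SparkSession, the same pattern adapts directly; here is a sketch (the trait and method names are my own invention, not from any library):

import org.apache.spark.sql.SparkSession

trait SparkSessionSetup {
  def withSparkSession(testMethod: SparkSession => Any): Unit = {
    // Build a local session, loan it to the test, and always stop it afterwards.
    val spark = SparkSession.builder()
      .master("local")
      .appName("Spark session test")
      .getOrCreate()
    try {
      testMethod(spark)
    } finally {
      spark.stop()
    }
  }
}

A test then reads the same way as the SparkContext version above, just with withSparkSession { spark => ... } as the wrapper.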

Suffering-Oriented Programming (Thanks, Nathan)

It is totally a matter of preference, but I prefer to use the Loan Pattern and wire things up myself for as long as I can before bringing in another framework. Aside from just trying to stay lightweight, frameworks sometimes add a lot of "magic" that makes debugging test failures hard to reason about. So I take a Suffering-Oriented Programming approach, where I avoid adding a new framework until the pain of not having it is too much to bear. But again, this is up to you.

The best choice for that alternate framework is of course spark-testing-base, as @ShankarKoirala mentioned. In that case, the test above would look like this:

import com.holdenkarau.spark.testing.SharedSparkContext
import org.scalatest.{Matchers, WordSpec}

class MySpec extends WordSpec with Matchers with SharedSparkContext {
  "My analytics" should {
    "calculate the right thing" in {
      val data = Seq(...)
      val rdd = sc.parallelize(data)
      val total = rdd.map(...).filter(...).map(...).reduce(_ + _)

      total shouldBe 1000
    }
  }
}

Note how I didn't have to do anything to deal with the SparkContext. SharedSparkContext gave me all that -- with sc as the SparkContext -- for free. Personally, though, I wouldn't bring in this dependency just for this purpose, since the Loan Pattern does exactly what I need. Also, with so much unpredictability in distributed systems, it can be a real pain to have to trace through the magic that happens in the source code of a third-party library when things go wrong in continuous integration.

Now where spark-testing-base really shines is with the Hadoop-based helpers like HDFSClusterLike and YARNClusterLike. Mixing those traits in can really save you a lot of setup pain. Another place where it shines is with the ScalaCheck-like properties and generators -- assuming, of course, you understand how property-based testing works and why it is useful. But again, I would personally hold off on using it until my analytics and my tests reach that level of sophistication.

"Only a Sith deals in absolutes." “只有西斯才是绝对的。” -- Obi-Wan Kenobi ——欧比旺·克诺比

Of course, you don't have to choose one or the other either. Perhaps you could use the Loan Pattern approach for most of your tests and spark-testing-base only for a few, more rigorous tests. The choice isn't binary; you can do both.

Integration Testing with Spark Streaming

Finally, I would just like to present a snippet of what a Spark Streaming integration test setup with in-memory values might look like without spark-testing-base:

import scala.collection.mutable

import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream

val sparkContext: SparkContext = ...
val data: Seq[(String, String)] = Seq(("a", "1"), ("b", "2"), ("c", "3"))
val rdd: RDD[(String, String)] = sparkContext.parallelize(data)
val strings: mutable.Queue[RDD[(String, String)]] = mutable.Queue.empty[RDD[(String, String)]]
val streamingContext = new StreamingContext(sparkContext, Seconds(1))
val dStream: InputDStream[(String, String)] = streamingContext.queueStream(strings)
strings += rdd

This is simpler than it looks. It really just turns a sequence of data into a queue to feed to the DStream. Most of it is really just boilerplate setup that works with the Spark APIs. Regardless, you can compare this with StreamingSuiteBase as found in spark-testing-base to decide which you prefer.
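
To round that out, here is one hedged way to drive the queue stream above and gather its output; the buffer, the sleep-based wait, and the assertion are illustrative only, not a recommendation for production test suites:

import scala.collection.mutable
import org.apache.spark.rdd.RDD

// Collect whatever comes out of the stream into a driver-side buffer.
val results = mutable.ArrayBuffer.empty[(String, String)]
dStream.foreachRDD { (batch: RDD[(String, String)]) =>
  results ++= batch.collect()   // fine for tiny in-memory test data
}

streamingContext.start()
Thread.sleep(2000)              // crude, but enough for the one-second batch interval in a sketch
streamingContext.stop(stopSparkContext = false, stopGracefully = true)

assert(results.toSet == data.toSet)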

This might be my longest post ever, so I will leave it here. I hope others chime in with other ideas to help improve the quality of our analytics with the same agile software engineering practices that have improved all other application development.

And with apologies for the shameless plug, you can check out our course Software Engineering with Apache Spark, where we address a lot of these ideas and more. We hope to have an online version soon.

You can write a simple test with FunSuite and BeforeAndAfterEach like below:

import org.apache.spark.sql.SparkSession
import org.scalatest.{BeforeAndAfterEach, FunSuite}

class Tests extends FunSuite with BeforeAndAfterEach {

  var sparkSession: SparkSession = _

  override def beforeEach(): Unit = {
    sparkSession = SparkSession.builder()
      .appName("udf testings")
      .master("local")
      // .config("key", "value")  // add any extra configuration here
      .getOrCreate()
  }

  test("your test name here"){
    //your unit test assert here like below
    assert("True".toLowerCase == "true")
  }

  override def afterEach(): Unit = {
    sparkSession.stop()
  }
}

You don't need to create separate functions in your test; you can simply write:

test ("test name") {//implementation and assert}

Holden Karau has written the really nice spark-testing-base testing library.

You should check it out; below is a simple example:

import com.holdenkarau.spark.testing.SharedSparkContext
import org.scalatest.FunSuite

class TestSharedSparkContext extends FunSuite with SharedSparkContext {

  val expectedResult = List(("a", 3),("b", 2),("c", 4))

  test("Word counts should be equal to expected") {
    verifyWordCount(Seq("c a a b a c b c c"))
  }

  def verifyWordCount(seq: Seq[String]): Unit = {
    assertResult(expectedResult)(new WordCount().transform(sc.makeRDD(seq)).collect().toList)
  }
}

Hope this helps!

Since Spark 1.6 you could use SharedSparkContext or SharedSQLContext, which Spark uses for its own unit tests:

class YourAppTest extends SharedSQLContext {

  var app: YourApp = _

  protected override def beforeAll(): Unit = {
    super.beforeAll()

    app = new YourApp
  }

  protected override def afterAll(): Unit = {
    super.afterAll()
  }

  test("Your test") {
    val df = sqlContext.read.json("examples/src/main/resources/people.json")

    app.run(df)
  }
}

Since Spark 2.3, SharedSparkSession is available:

class YourAppTest extends SharedSparkSession {

  var app: YourApp = _

  protected override def beforeAll(): Unit = {
    super.beforeAll()

    app = new YourApp
  }

  protected override def afterAll(): Unit = {
    super.afterAll()
  }

  test("Your test") {
    val df = spark.read.json("examples/src/main/resources/people.json")

    app.run(df)
  }
}

UPDATE:

Maven dependency:

<dependency>
  <groupId>org.scalactic</groupId>
  <artifactId>scalactic</artifactId>
  <version>SCALATEST_VERSION</version>
</dependency>
<dependency>
  <groupId>org.scalatest</groupId>
  <artifactId>scalatest</artifactId>
  <version>SCALATEST_VERSION</version>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core</artifactId>
  <version>SPARK_VERSION</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql</artifactId>
  <version>SPARK_VERSION</version>
  <type>test-jar</type>
  <scope>test</scope>
</dependency>

SBT dependency:

"org.scalactic" %% "scalactic" % SCALATEST_VERSION
"org.scalatest" %% "scalatest" % SCALATEST_VERSION % "test"
"org.apache.spark" %% "spark-core" % SPARK_VERSION % Test classifier "tests"
"org.apache.spark" %% "spark-sql" % SPARK_VERSION % Test classifier "tests"

In addition, you could check the test sources of Spark, where there is a huge set of various test suites.

UPDATE 2:

Apache Spark Unit Testing Part 1 — Core Components

Apache Spark Unit Testing Part 2 — Spark SQL

Apache Spark Unit Testing Part 3 — Streaming

Apache Spark Integration Testing

Test Driven Development of Apache Spark applications

I like to create a SparkSessionTestWrapper trait that can be mixed in to test classes. Shankar's approach works, but it's prohibitively slow for test suites with multiple files.

import org.apache.spark.sql.SparkSession

trait SparkSessionTestWrapper {

  lazy val spark: SparkSession = {
    SparkSession.builder().master("local").appName("spark session").getOrCreate()
  }

}

The trait can be used as follows:

import org.scalatest.FunSpec

class DatasetSpec extends FunSpec with SparkSessionTestWrapper {

  import spark.implicits._

  describe("#count") {

    it("returns a count of all the rows in a DataFrame") {

      val sourceDF = Seq(
        ("jets"),
        ("barcelona")
      ).toDF("team")

      assert(sourceDF.count === 2)

    }

  }

}

Check the spark-spec project for a real-life example that uses the SparkSessionTestWrapper approach.

Update

The spark-testing-base library automatically adds the SparkSession when certain traits are mixed in to the test class (e.g. when DataFrameSuiteBase is mixed in, you'll have access to the SparkSession via the spark variable).

I created a separate testing library called spark-fast-tests to give users full control of the SparkSession when running their tests. I don't think a test helper library should set up the SparkSession. Users should be able to start and stop their SparkSession as they see fit (I like to create one SparkSession and use it throughout the test suite run).

Here's an example of the spark-fast-tests assertSmallDatasetEquality method in action:

import com.github.mrpowers.spark.fast.tests.DatasetComparer
import org.apache.spark.sql.functions.col
import org.scalatest.FunSpec

class DatasetSpec extends FunSpec with SparkSessionTestWrapper with DatasetComparer {

  import spark.implicits._

  it("aliases a DataFrame") {

    val sourceDF = Seq(
      ("jose"),
      ("li"),
      ("luisa")
    ).toDF("name")

    val actualDF = sourceDF.select(col("name").alias("student"))

    val expectedDF = Seq(
      ("jose"),
      ("li"),
      ("luisa")
    ).toDF("student")

    assertSmallDatasetEquality(actualDF, expectedDF)

  }

}

I could solve the problem with the code below.

The spark-hive dependency is added in the project pom.

import com.holdenkarau.spark.testing.DataFrameSuiteBase
import org.scalatest.FunSuite

class DataFrameTest extends FunSuite with DataFrameSuiteBase {
  test("test dataframe") {
    val sparkSession = spark
    import sparkSession.implicits._
    val df = sparkSession.read.format("csv").load("path/to/csv")
    // rest of the operations
  }
}

Another way to unit test using JUnit:

import org.apache.spark.sql.SparkSession
import org.junit.Assert._
import org.junit.{After, Before, _}

class SessionSparkTest {
  var spark: SparkSession = _

  @Before
  def beforeFunction(): Unit = {
    //spark = SessionSpark.getSparkSession()
    spark = SparkSession.builder().appName("App Name").master("local").getOrCreate()
    System.out.println("Before Function")
  }

  @After
  def afterFunction(): Unit = {
    spark.stop()
    System.out.println("After Function")
  }

  @Test
  def testRddCount() = {
    val rdd = spark.sparkContext.parallelize(List(1, 2, 3))
    val count = rdd.count()
    assertTrue(3 == count)
  }

  @Test
  def testDfNotEmpty() = {
    val sqlContext = spark.sqlContext
    import sqlContext.implicits._
    val numDf = spark.sparkContext.parallelize(List(1, 2, 3)).toDF("nums")
    assertFalse(numDf.head(1).isEmpty)
  }

  @Test
  def testDfEmpty() = {
    val sqlContext = spark.sqlContext
    import sqlContext.implicits._
    val emptyDf = spark.sqlContext.createDataset(spark.sparkContext.emptyRDD[Num])
    assertTrue(emptyDf.head(1).isEmpty)
  }
}

case class Num(id: Int)
