
What is the proper syntax for using broadcast variables in Spark with Scala?

I want to use a broadcast variable in Spark with Scala, but I can't find enough help on how to use them. Say I have an object of class A, which I would normally declare as follows in Scala:

val a = new A()

What is the syntax for declaring it as a broadcast variable, and how would I call its methods?

If sc is a SparkContext, then val broadcasted = sc.broadcast(a) will broadcast a.
You can then access it with broadcasted.value.
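A minimal sketch of that pattern follows. The question does not show class A, so the definition here (a factor field and a scale method) is a hypothetical stand-in, as are the object name BroadcastExample and the local master setting. The one firm requirement is that the broadcast value be serializable.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.broadcast.Broadcast

// Hypothetical stand-in for the question's class A. It extends Serializable
// so that Spark can ship the instance to the executors.
class A(val factor: Int) extends Serializable {
  def scale(x: Int): Int = x * factor
}

object BroadcastExample {
  // Broadcasts an instance of A and calls one of its methods inside a task.
  def run(sc: SparkContext): Array[Int] = {
    val a = new A(3)
    val broadcasted: Broadcast[A] = sc.broadcast(a)

    // Inside the closure, read the object through .value and call its methods.
    sc.parallelize(Seq(1, 2, 3))
      .map(x => broadcasted.value.scale(x))
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying the example out.
    val conf = new SparkConf().setAppName("broadcast-example").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try println(run(sc).mkString(", "))
    finally sc.stop()
  }
}
```

Referring to broadcasted.value inside the closure, rather than to a directly, is what makes the tasks use the broadcast copy instead of serializing the object with every task.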

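Marth is right. When you no longer need the broadcast variable, you should also release it. Note that the call is made on the broadcast variable itself, not on the SparkContext: broadcasted.destroy() removes it permanently, while broadcasted.unpersist(blocking) only drops the cached copies on the executors, where blocking is a flag that controls whether the call waits until the cleanup completes. I also want to highlight that it is recommended to avoid broadcasting small variables.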
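The cleanup calls can be sketched as below. Both are methods on the Broadcast handle returned by sc.broadcast. The object name BroadcastCleanup and the lookup map are made up for illustration, and the behaviour noted in the comments reflects my understanding of the Spark API rather than anything stated in the thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Try

object BroadcastCleanup {
  // Returns (sum before unpersist, sum after unpersist, whether reading after destroy failed).
  def run(sc: SparkContext): (Int, Int, Boolean) = {
    // Hypothetical lookup table used only for this illustration.
    val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2))
    val keys = sc.parallelize(Seq("a", "b", "c"))

    val before = keys.map(k => lookup.value.getOrElse(k, 0)).reduce(_ + _)

    // unpersist drops the copies cached on the executors; the variable stays
    // usable and is sent again if a later job reads it. blocking = true waits
    // until the cleanup has completed.
    lookup.unpersist(blocking = true)
    val after = keys.map(k => lookup.value.getOrElse(k, 0)).reduce(_ + _)

    // destroy releases all data and metadata; the variable must not be used afterwards.
    lookup.destroy()
    val failedAfterDestroy = Try(lookup.value).isFailure

    (before, after, failedAfterDestroy)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying the example out.
    val conf = new SparkConf().setAppName("broadcast-cleanup").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try println(run(sc))
    finally sc.stop()
  }
}
```

A reasonable rule of thumb is to use unpersist when the variable may be needed again later and destroy only once nothing will read it anymore.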

