
How to broadcast a variable in a Spark Streaming mapping function?

I know the usual routine: sc.broadcast(x).

However, Spark Streaming currently does not support broadcast variables when checkpointing is enabled.

The official guide offers a workaround: http://spark.apache.org/docs/latest/streaming-programming-guide.html#accumulators-and-broadcast-variables. However, that workaround only applies inside foreachRDD functions.
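For reference, the guide's workaround is a lazily instantiated broadcast singleton, looked up inside foreachRDD, so the broadcast is re-created after recovery from a checkpoint instead of being restored with it. Roughly, following the guide's Java example (the blacklist contents are placeholders):

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

// Lazily instantiated singleton: created on first use after (re)start,
// never written into the checkpoint itself.
class JavaWordBlacklist {
  private static volatile Broadcast<List<String>> instance = null;

  static Broadcast<List<String>> getInstance(JavaSparkContext jsc) {
    if (instance == null) {
      synchronized (JavaWordBlacklist.class) {
        if (instance == null) {
          List<String> wordBlacklist = Arrays.asList("a", "b", "c");
          instance = jsc.broadcast(wordBlacklist);
        }
      }
    }
    return instance;
  }
}

// Usage inside foreachRDD, where the RDD hands the context back to us:
// wordCounts.foreachRDD((rdd, time) -> {
//   Broadcast<List<String>> blacklist =
//       JavaWordBlacklist.getInstance(new JavaSparkContext(rdd.context()));
//   ...
// });
```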

Now I want to use large or unserializable variables (like a KafkaProducer) that need to be broadcast this way inside mapping functions (such as flatMapToPair). But since no RDD is visible there, I cannot retrieve the Spark context to broadcast the lazily evaluated variable, and if I capture the context used to create the DStreams, or one retrieved from the DStreams, the task becomes unserializable.

So how can I use broadcast variables in mapping functions? Or is there any workaround for using large or unserializable variables in mapping functions?

I finally found the solution: use the transform functions rather than the map functions. Inside a transform function we handle the RDD ourselves and apply the map functions to it, so we hold a reference to the RDD and can recover the Spark context from it, as sketched below.
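A minimal sketch of this in the Java API, assuming a JavaDStream<String> named lines; WordWeights is a hypothetical singleton holder built on the same lazy pattern as JavaWordBlacklist above, and the word-weight map stands in for whatever large variable you need:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import scala.Tuple2;

// Hypothetical singleton holder, same lazy pattern as above.
class WordWeights {
  private static volatile Broadcast<Map<String, Integer>> instance = null;

  static Broadcast<Map<String, Integer>> getInstance(JavaSparkContext jsc) {
    if (instance == null) {
      synchronized (WordWeights.class) {
        if (instance == null) {
          instance = jsc.broadcast(new HashMap<>()); // placeholder contents
        }
      }
    }
    return instance;
  }
}

// Inside transformToPair we hold a plain RDD on the driver, so
// rdd.context() gives the SparkContext back; only the broadcast handle
// (not the context itself) is captured by the inner flatMapToPair closure.
JavaPairDStream<String, Integer> weighted = lines.transformToPair(rdd -> {
  JavaSparkContext jsc = new JavaSparkContext(rdd.context());
  Broadcast<Map<String, Integer>> weights = WordWeights.getInstance(jsc);
  return rdd.flatMapToPair(line -> {
    List<Tuple2<String, Integer>> out = new ArrayList<>();
    for (String word : line.split(" ")) {
      out.add(new Tuple2<>(word, weights.value().getOrDefault(word, 1)));
    }
    return out.iterator();
  });
});
```

For something that genuinely cannot be serialized, such as a KafkaProducer, a common companion trick is to broadcast a small serializable wrapper that creates the producer lazily on first use on each executor, rather than trying to broadcast the producer itself.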
