
Sharing Zookeeper configuration on multiple Spark Executors

I have configuration information stored in ZooKeeper. I'm using Apache Curator to read it (if there is a better way to read it, I'm happy to use that), with a Curator watcher so that I receive the new configuration whenever it changes in ZooKeeper. I'm using this configuration in Spark. How can I share it with all Spark executors of the same application?

Thank you!

Later edit:

Thank you Dikei,

In the following code, where would you put the watcher implementation? I'm new to Spark and I'm not exactly sure what goes to each worker.

Thank you!

final JavaDStream<ElementMessage> nodeMessageStream = mapWithStateDistinctAndFiltered
        .flatMap(pair -> pair._2.buildElementMessages())
        .filter(f -> f != null);

nodeMessageStream.foreachRDD(rdd -> {
    rdd.foreachPartition(r -> {
        final ElementRecordRestClient rest = new ElementRecordRestClient(
                startProps.getProperty(InputPropertyKey.WEPAPP_URL.toString()));
        r.forEachRemaining(message -> {
            rest.createObject(message.toElementRecord());
        });
    });
});

What I would do in this case is run the Curator watcher on the master node (the driver) and broadcast the configuration to all executors using a Spark broadcast variable. Whenever the configuration changes, you stop the current streaming context and start a new one with the new configuration. This ensures that your results are always consistent.
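
The stop-and-restart loop can be sketched in plain Java. This is only an illustration of the control flow, not a tested Spark program: StreamingJob, ConfigRestartLoop and ConfigRestartDemo are hypothetical names, and the configuration is assumed to fit in a single String. In a real application, start() is roughly where the streaming context would be created and the configuration broadcast with JavaSparkContext.broadcast, stop() is where the streaming context would be stopped, and onConfigChanged would be called from the Curator watcher callback on the driver.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical stand-in for one streaming run. In a real application start()
// would create the streaming context and broadcast the configuration, and
// stop() would stop that context.
interface StreamingJob {
    void start();
    void stop();
}

// Driver-side loop: one job is running per configuration version.
final class ConfigRestartLoop {
    // Optional.empty() is used as the shutdown signal.
    private final BlockingQueue<Optional<String>> updates = new LinkedBlockingQueue<>();
    private final Function<String, StreamingJob> jobFactory;

    ConfigRestartLoop(Function<String, StreamingJob> jobFactory) {
        this.jobFactory = jobFactory;
    }

    // Call this from the ZooKeeper watcher callback running on the driver.
    void onConfigChanged(String newConfig) {
        updates.add(Optional.of(newConfig));
    }

    void shutdown() {
        updates.add(Optional.empty());
    }

    // Blocks the calling thread until shutdown() is called or it is interrupted.
    void run(String initialConfig) {
        Optional<String> config = Optional.of(initialConfig);
        while (config.isPresent()) {
            StreamingJob job = jobFactory.apply(config.get());
            job.start();
            try {
                config = awaitNext(); // wait for a new configuration or shutdown
            } finally {
                job.stop(); // stop the old job before starting one with the new config
            }
        }
    }

    private Optional<String> awaitNext() {
        try {
            return updates.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }
}

final class ConfigRestartDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ConfigRestartLoop loop = new ConfigRestartLoop(cfg -> new StreamingJob() {
            @Override public void start() { log.add("start:" + cfg); }
            @Override public void stop() { log.add("stop:" + cfg); }
        });
        loop.onConfigChanged("v2"); // simulated watcher event
        loop.shutdown();
        loop.run("v1");
        System.out.println(log);
    }
}
```

Because a job is always stopped before the next one starts, every batch is processed with exactly one configuration version, which is the consistency property described above.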

The other way would be to read the ZooKeeper configuration inside the foreachPartition lambda function. But because the configuration is then read independently for each partition, different partitions of the same RDD can get different configurations, which might not be what you expect.
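
To make the caveat concrete, here is a small plain-Java model of it (no Spark or ZooKeeper involved). Everything in it is hypothetical: PartitionConfigDemo is an illustrative name, the AtomicReference stands in for the ZooKeeper node, and the partitions are simply processed one after another, with the configuration changing after the first one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class PartitionConfigDemo {

    // Processes the partitions one after another and records which
    // configuration value each one used. afterFirstPartition simulates a
    // ZooKeeper update arriving while the batch is still being processed.
    static List<String> processBatch(int partitions,
                                     Supplier<String> configReader,
                                     Runnable afterFirstPartition) {
        List<String> seen = new ArrayList<>();
        for (int p = 0; p < partitions; p++) {
            seen.add(configReader.get()); // what the foreachPartition body would read
            if (p == 0) {
                afterFirstPartition.run();
            }
        }
        return seen;
    }

    // Each partition reads the store itself; the value changes after partition 0.
    static List<String> readPerPartition() {
        AtomicReference<String> store = new AtomicReference<>("v1");
        return processBatch(3, store::get, () -> store.set("v2"));
    }

    // The value is read once up front and every partition reuses that snapshot.
    static List<String> readOncePerBatch() {
        AtomicReference<String> store = new AtomicReference<>("v1");
        String snapshot = store.get();
        return processBatch(3, () -> snapshot, () -> store.set("v2"));
    }

    public static void main(String[] args) {
        System.out.println("per partition:  " + readPerPartition());
        System.out.println("once per batch: " + readOncePerBatch());
    }
}
```

With per-partition reads the three partitions see v1, v2, v2, while the snapshot variant sees v1 for all three. Whether the mixed result is acceptable depends on what the configuration controls.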

