
How to store multiple JSON objects in a concurrent Java List while Spark task executors do work

I am trying to populate List-like data structures from multiple Spark task executors, so I am looking for atomicity.

Say I have 10 rows. Each row has m key-value pairs: key1-val1, ..., keym-valm.

My task executors are trying to ingest these rows into a database such as DynamoDB. My DB ingestor has OnSuccess and OnFailure handlers written. I want to know whether I can ensure I have a "concurrent" List with 10 items, where each item points to one row, i.e. each item holds that row's m key-value pairs.

Which data structure should I use? Since this is invoked by the task executors, I thought of using LinkedBlockingQueue. But what would the exact collection type be?

Does a BlockingQueue look OK? And how would each element in the blocking queue contain a list of key-value pairs?

If you are looking to accumulate the results of tasks in Spark, you should use Spark's accumulator framework. You can read about the framework here: https://spark.apache.org/docs/2.2.0/rdd-programming-guide.html#accumulators

In the case of plain Java concurrency, if you just want to store values from different threads, then instead of a blocking queue you can simply use a ConcurrentHashMap, where the key is your row number (1 to 10) and the value is a ConcurrentLinkedQueue that holds that row's key-value pairs.
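
The plain-Java suggestion could look roughly like the sketch below. It is only an illustration under assumptions: the class name `RowCollector`, the method `collect`, the pool of 4 threads, and the synthetic `keyN`/`valN` pairs are all made up for the example, and the `add` call merely stands in for whatever the OnSuccess handler would record. It uses no Spark API.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: collects key-value pairs per row from several worker threads.
public class RowCollector {

    // Returns a map of row number (1..rows) -> key-value pairs recorded for that row.
    public static Map<Integer, ConcurrentLinkedQueue<Map.Entry<String, String>>> collect(
            int rows, int pairsPerRow) throws InterruptedException {
        Map<Integer, ConcurrentLinkedQueue<Map.Entry<String, String>>> results =
                new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an assumption

        for (int row = 1; row <= rows; row++) {
            final int rowId = row;
            pool.submit(() -> {
                for (int k = 1; k <= pairsPerRow; k++) {
                    // Stand-in for an OnSuccess handler. computeIfAbsent creates the
                    // per-row queue atomically, and the queue's add is thread-safe.
                    results.computeIfAbsent(rowId, id -> new ConcurrentLinkedQueue<>())
                           .add(new SimpleImmutableEntry<>("key" + k, "val" + k));
                }
            });
        }

        // Wait for all workers before reading the results.
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<Integer, ConcurrentLinkedQueue<Map.Entry<String, String>>> results = collect(10, 3);
        System.out.println(results.size() + " rows collected");
        System.out.println("row 1 -> " + results.get(1));
    }
}
```

Note that this only works when all writers are threads in the same JVM. Spark executors normally run in separate JVMs from the driver, so a collection like this on the driver would not see their writes, which is why the accumulator framework is the suggestion for the Spark case.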
