
Passing a List Iterator to multiple Threads in Java

I have a list that contains roughly 200K elements.

Can I pass the iterator for this list to multiple threads and have them iterate over the whole lot, without any of them accessing the same elements?

This is what I am thinking of at the moment.

Main:

public static void main(String[] args)
{
    // Imagine this list has the 200,000 elements.
    ArrayList<Integer> list = new ArrayList<Integer>();

    // Get the iterator for the list.
    Iterator<Integer> i = list.iterator();

    // Create MyThread, passing in the iterator for the list.
    MyThread threadOne = new MyThread(i);
    MyThread threadTwo = new MyThread(i);
    MyThread threadThree = new MyThread(i);

    // Start the threads.
    threadOne.start();
    threadTwo.start();
    threadThree.start();
}

MyThread:

public class MyThread extends Thread
{

    Iterator<Integer> i;

    public MyThread(Iterator<Integer> i)
    {
        this.i = i;
    }

    public void run()
    {
        while (this.i.hasNext()) {
            Integer num = this.i.next();
            // Do something with num here.
        }
    }
}

My desired outcome here is that each thread processes roughly 66,000 elements, without locking up the iterator too much, and without any two threads accessing the same element.

Does this sound doable?

Do you really need to manipulate threads and iterators manually? You could use Java 8 Streams and let parallel() do the job.

By default, a parallel stream runs on the common fork/join pool, which has one fewer worker thread than you have processors; the calling thread also takes part in the work.

Example:

list.stream()
    .parallel()
    .forEach(this::doSomething)
;

//For example, display the current integer and the current thread number.
public void doSomething(Integer i) {
  System.out.println(String.format("%d, %d", i, Thread.currentThread().getId()));
}

Result:

49748, 13
49749, 13
49750, 13
192710, 14
105734, 17
105735, 17
105736, 17
[...]

Edit: if you are using Maven, you will need to add this configuration to pom.xml in order to use Java 8:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.3</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>

You can't do it in a thread-safe way with a single iterator. I suggest using sublists:

List<Integer> sub1 = list.subList(0, 100);
List<Integer> sub2 = list.subList(100, 200);

The ArrayList#subList() method just wraps the given list without copying elements. You can then iterate over each sublist in a different thread.
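As a rough illustration of the sublist idea, the sketch below splits a list into contiguous views and hands each view to its own thread. The class name SubListDemo, the method sumInParallel, and the use of a sum as the per-element work are all assumptions made for the example, not part of the original answer. It also assumes the list is not modified while the threads are running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SubListDemo {

    // Splits the list into at most `parts` contiguous views and sums each view in its own thread.
    static long sumInParallel(List<Integer> list, int parts) throws InterruptedException {
        AtomicLong total = new AtomicLong();
        List<Thread> threads = new ArrayList<>();
        int chunk = (list.size() + parts - 1) / parts; // ceiling division

        for (int start = 0; start < list.size(); start += chunk) {
            int end = Math.min(start + chunk, list.size());
            List<Integer> view = list.subList(start, end); // a view, not a copy
            Thread t = new Thread(() -> {
                long local = 0;
                for (int n : view) {
                    local += n; // stand-in for "do something with num"
                }
                total.addAndGet(local);
            });
            threads.add(t);
            t.start();
        }

        // Wait for every worker before reading the result.
        for (Thread t : threads) {
            t.join();
        }
        return total.get();
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < 200_000; i++) {
            list.add(i);
        }
        System.out.println(sumInParallel(list, 3));
    }
}
```

Because each thread iterates over its own view with its own iterator, no locking is needed for the traversal itself; only the shared total is updated atomically.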

Since the next() method of a class that implements the Iterator interface changes the iterator's state, concurrent use of next() needs synchronization. The synchronization can be accomplished with a synchronized block on the iterator object, as follows:

synchronized(i)
{
    i.next();
}
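One caveat with a shared iterator is that hasNext() and next() have to run inside the same synchronized block; otherwise another thread can take the last element between the two calls. A minimal sketch under that assumption follows. The names SharedIteratorDemo and sumWithSharedIterator are made up for illustration, and summing stands in for the real per-element work.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SharedIteratorDemo {

    // Several workers pull elements from one shared iterator; each element is handed out once.
    static long sumWithSharedIterator(List<Integer> list, int workers) throws InterruptedException {
        Iterator<Integer> it = list.iterator();
        AtomicLong total = new AtomicLong();

        Runnable task = () -> {
            while (true) {
                Integer num;
                // The check and the fetch must be atomic together.
                synchronized (it) {
                    if (!it.hasNext()) {
                        return;
                    }
                    num = it.next();
                }
                // Process outside the lock so workers only contend on the fetch.
                total.addAndGet(num);
            }
        };

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(task);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return total.get();
    }
}
```

With very cheap per-element work, the lock is taken once per element and can dominate the running time, which is one reason the other approaches in this thread tend to scale better.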

That said, if all you need is parallel processing of the list, I recommend the Stream API, as in the answer above.

Hi, to protect your threads from deadlocks or starvation you can use an ExecutorService thread pool. This works better for me than using synchronized, locks or reentrant locks. You can also try the Fork/Join framework, but I haven't used it before. This is only sample code, but I hope you get the idea:

public static void main(String[] args) {
    // Size the pool to the machine, not to the number of elements.
    ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
    List<Future<Integer>> futureList = new ArrayList<>();
    // iteration code goes here, e.g. futureList.add(executor.submit(new MyThread()));
    executor.shutdown();
}

public class MyThread implements Callable<Integer> {

    @Override
    public Integer call() throws Exception {
        // code goes here!
        return 0; // placeholder result
    }

}
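To make the thread pool outline a little more concrete, here is one possible way to fill in the gaps, offered as a sketch rather than as what the answer's author had in mind. It assumes the work can be split into contiguous chunks of the list, with one Callable per chunk; the names ExecutorChunkDemo and sumWithPool are hypothetical, and the sum is a stand-in for the real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorChunkDemo {

    // Submits one task per chunk of the list to a fixed pool and combines the partial results.
    static long sumWithPool(List<Integer> list, int threads) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            int chunk = (list.size() + threads - 1) / threads; // ceiling division

            for (int start = 0; start < list.size(); start += chunk) {
                int end = Math.min(start + chunk, list.size());
                List<Integer> view = list.subList(start, end);
                Callable<Long> task = () -> {
                    long local = 0;
                    for (int n : view) {
                        local += n; // stand-in for the real per-element work
                    }
                    return local;
                };
                futures.add(executor.submit(task));
            }

            long total = 0;
            for (Future<Long> f : futures) {
                total += f.get(); // blocks until that chunk is done
            }
            return total;
        } finally {
            executor.shutdown(); // lets the pool threads exit once the work is finished
        }
    }
}
```

Each task returns its own partial result through a Future, so the workers share no mutable state and need no explicit locks.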

If you use a parallel stream, your code will run across several threads, with the elements distributed roughly evenly between them:

list.parallelStream().forEach(this::processInteger);

This approach makes it really simple to code; all the heavy lifting is done by the JRE.

Also, regarding your code, it is bad style to extend Thread. Instead, implement Runnable and pass an instance to the constructor of Thread.
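A small sketch of the Runnable style might look like the following. ListWorker, RunnableDemo and runWorker are hypothetical names, and counting elements is only a placeholder for the real work.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RunnableDemo {

    // The task only describes the work; it does not extend Thread.
    static class ListWorker implements Runnable {
        private final List<Integer> items;
        private final AtomicInteger processed;

        ListWorker(List<Integer> items, AtomicInteger processed) {
            this.items = items;
            this.processed = processed;
        }

        @Override
        public void run() {
            for (Integer ignored : items) {
                processed.incrementAndGet(); // placeholder for "do something with num"
            }
        }
    }

    // Runs the worker on a plain Thread and returns how many elements it handled.
    static int runWorker(List<Integer> items) throws InterruptedException {
        AtomicInteger processed = new AtomicInteger();
        Thread t = new Thread(new ListWorker(items, processed));
        t.start();
        t.join();
        return processed.get();
    }
}
```

Keeping the task separate from the thread means the same ListWorker could later be handed to an ExecutorService without changes.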


 