Removing items from a collection in Java while iterating over it

I want to be able to remove multiple elements from a set while I am iterating over it. Initially I hoped that iterators were smart enough for the naive solution below to work.

Set<SomeClass> set = new HashSet<SomeClass>();
fillSet(set);
Iterator<SomeClass> it = set.iterator();
while (it.hasNext()) {
    set.removeAll(setOfElementsToRemove(it.next()));
}

But this throws a ConcurrentModificationException.
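The failure can be reproduced in a few lines. The sketch below is only an illustration, not the asker's code: the class name CmeDemo and the Integer elements are made up for the example. HashSet's iterator is fail-fast, so once the set is structurally modified by anything other than the iterator itself, the next call to next() throws.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class CmeDemo {
    /** Returns true if removing directly from the set during iteration triggers the exception. */
    static boolean throwsOnDirectRemoval() {
        Set<Integer> set = new HashSet<Integer>();
        set.add(1);
        set.add(2);
        set.add(3);

        Iterator<Integer> it = set.iterator();
        Integer first = it.next();

        // Remove some other element through the set itself, bypassing the iterator.
        for (int candidate = 1; candidate <= 3; candidate++) {
            if (candidate != first) {
                set.remove(Integer.valueOf(candidate));
                break;
            }
        }

        try {
            it.next(); // the iterator notices the modification here
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Exception thrown: " + throwsOnDirectRemoval());
    }
}
```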

Note that iterator.remove() will not work as far as I can see, because I need to remove multiple things at a time. Also assume that it is not possible to identify which elements to remove "on the fly", but it is possible to write the method setOfElementsToRemove(). In my specific case it would take up a lot of memory and processing time to determine what to remove while iterating. Making copies is also not possible because of memory constraints.

setOfElementsToRemove() will generate some set of SomeClass instances that I want to remove, and fillSet(set) will fill the set with entries.

After searching Stack Overflow I could not find a good solution to this problem, but after a few hours' break I realized the following would do the job.

Set<SomeClass> set = new HashSet<SomeClass>();
Set<SomeClass> outputSet = new HashSet<SomeClass>();
fillSet(set);
while (!set.isEmpty()) {
    Iterator<SomeClass> it = set.iterator();
    SomeClass instance = it.next();
    outputSet.add(instance);
    set.removeAll(setOfElementsToRemoveIncludingThePassedValue(instance));
}

setOfElementsToRemoveIncludingThePassedValue() will generate a set of elements to remove that includes the value passed to it. We need to remove the passed value so that set eventually becomes empty.

My question is whether anyone has a better way of doing this, or whether there are collection operations that support these kinds of removals.

Also, I thought I would post my solution because there seems to be a need and I wanted to contribute to the excellent resource that is Stack Overflow.

Normally, when you remove an element from a collection while looping over it, you'll get a ConcurrentModificationException. This is partially why the Iterator interface has a remove() method. Calling the iterator's own remove() is the only safe way to modify a collection while traversing it with that iterator.

The code would go something like this:

Set<SomeClass> set = new HashSet<SomeClass>();
fillSet(set);
Iterator<SomeClass> setIterator = set.iterator();
while (setIterator.hasNext()) {
    SomeClass currentElement = setIterator.next();
    if (setOfElementsToRemove(currentElement).size() > 0) {
        setIterator.remove();
    }
}

This way you safely remove every element for which setOfElementsToRemove() returns a non-empty set.
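On Java 8 or later, the same per-element removal can be written with Collection.removeIf, which does the iterator bookkeeping internally. This is not part of the original answer; it is a small sketch in which RemoveIfSketch and shouldRemove() are hypothetical stand-ins for a test such as !setOfElementsToRemove(e).isEmpty(). Like the loop above, it only decides about the element currently being tested and does not remove a whole group at once.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class RemoveIfSketch {
    /** Hypothetical per-element test; here, even numbers are the ones to drop. */
    static boolean shouldRemove(Integer e) {
        return e % 2 == 0;
    }

    /** Removes every element that matches shouldRemove() and returns the same set. */
    static Set<Integer> prune(Set<Integer> set) {
        set.removeIf(RemoveIfSketch::shouldRemove);
        return set;
    }

    public static void main(String[] args) {
        Set<Integer> set = new HashSet<Integer>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(prune(set));
    }
}
```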

EDIT

Based on a comment to another answer, this may be more what you want:

Set<SomeClass> set = new HashSet<SomeClass>();
Set<SomeClass> removalSet = new HashSet<SomeClass>();
fillSet(set);

for (SomeClass currentElement : set) {
    removalSet.addAll(setOfElementsToRemove(currentElement));
}

set.removeAll(removalSet);

Instead of iterating through all the elements in the Set to remove the ones you want, you can use Google Collections (though it's nothing you couldn't do on your own) and apply a Predicate to mask the ones you don't need.

package com.stackoverflow.q1675037;

import java.util.HashSet;
import java.util.Set;

import org.junit.Assert;
import org.junit.Test;

import com.google.common.base.Predicate;
import com.google.common.collect.Iterables;
import com.google.common.collect.Sets;


public class SetTest
{
public void testFilter(final Set<String> original, final Set<String> toRemove, final Set<String> expected)
{

    Iterable<String> mask = Iterables.filter(original, new Predicate<String>()
    {
        @Override
        public boolean apply(String next) {
        return !toRemove.contains(next);
        }
    });

    HashSet<String> filtered = Sets.newHashSet(mask);

    Assert.assertEquals(original.size() - toRemove.size(), filtered.size());
    Assert.assertEquals(expected, filtered);        
}


@Test
public void testFilterNone()
{
    Set<String> original = new HashSet<String>(){
        {
            this.add("foo");
            this.add("bar");
            this.add("foobar");
        }
    };

    Set<String> toRemove = new HashSet<String>();

    Set<String> expected = new HashSet<String>(){
        {
            this.add("foo");                
            this.add("bar");
            this.add("foobar");
        }
    };

    this.testFilter(original, toRemove, expected);
}

@Test
public void testFilterAll()
{
    Set<String> original = new HashSet<String>(){
        {
            this.add("foo");
            this.add("bar");
            this.add("foobar");
        }
    };

    Set<String> toRemove = new HashSet<String>(){
        {
            this.add("foo");
            this.add("bar");
            this.add("foobar");
        }
    };

    HashSet<String> expected = new HashSet<String>();
    this.testFilter(original, toRemove, expected);
}    

@Test
public void testFilterOne()
{
    Set<String> original = new HashSet<String>(){
        {
            this.add("foo");
            this.add("bar");
            this.add("foobar");
        }
    };

    Set<String> toRemove = new HashSet<String>(){
        {
            this.add("foo");
        }
    };

    Set<String> expected = new HashSet<String>(){
        {
            this.add("bar");
            this.add("foobar");
        }
    };

    this.testFilter(original, toRemove, expected);
}    


@Test
public void testFilterSome()
{
    Set<String> original = new HashSet<String>(){
        {
            this.add("foo");
            this.add("bar");
            this.add("foobar");
        }
    };

   Set<String> toRemove = new HashSet<String>(){
        {
            this.add("bar");
            this.add("foobar");
        }
    };

    Set<String> expected = new HashSet<String>(){
        {
            this.add("foo");
        }
    };

    this.testFilter(original, toRemove, expected);
}    
}

Any solution that involves removing from the set you're iterating while you're iterating it, but not via the iterator, will absolutely not work. Except possibly one: you could use a Collections.newSetFromMap(new ConcurrentHashMap<SomeClass, Boolean>( sizing params )). The catch is that now your iterator is only weakly consistent, meaning that each time you remove an element that you haven't encountered yet, it's undefined whether that element will show up later in your iteration or not. If that's not a problem, this might work for you.
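As a rough sketch of that idea (not the answerer's code), the example below iterates a set backed by a ConcurrentHashMap and calls removeAll() on it inside the loop. The class name and the Integer-based setOfElementsToRemove(), which returns the other values in 0..8 with the same remainder modulo 3, are assumptions made purely for illustration. Because a removed element may or may not still be handed out by the iterator, the loop checks contains() before acting on it.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class WeaklyConsistentSketch {
    /** Hypothetical stand-in: the other values in 0..8 sharing e's remainder modulo 3. */
    static Set<Integer> setOfElementsToRemove(int e) {
        Set<Integer> result = new HashSet<Integer>();
        for (int i = 0; i < 9; i++) {
            if (i != e && i % 3 == e % 3) {
                result.add(i);
            }
        }
        return result;
    }

    static Set<Integer> run() {
        Set<Integer> set = Collections.newSetFromMap(new ConcurrentHashMap<Integer, Boolean>());
        for (int i = 0; i < 9; i++) {
            set.add(i);
        }
        for (Integer e : set) {
            // An already-removed element might still be reported; skip it if so.
            if (set.contains(e)) {
                set.removeAll(setOfElementsToRemove(e));
            }
        }
        return set;
    }

    public static void main(String[] args) {
        // Which representative survives per group depends on iteration order.
        System.out.println(run());
    }
}
```

No exception is thrown, and one element per group survives, but which one is not specified.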

Another thing you can do is build up a toRemove set as you go, then call set.removeAll(toRemove) only at the end. Or copy the set before you start, so you can iterate over one copy while removing from the other.

EDIT: oops, I see Peter Nix had already suggested the toRemove idea (although with an unnecessarily hand-rolled removeAll).

You could try java.util.concurrent.CopyOnWriteArraySet, which gives you an iterator that is a snapshot of the set at the time the iterator was created. Any changes you make to the set (i.e. by calling removeAll()) won't be visible in the iterator, but are visible if you look at the set itself (and removeAll() won't throw).
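A minimal sketch of the snapshot behaviour, with the class name and the Integer-based setOfElementsToRemove() invented for illustration (it returns the other values in 0..8 with the same remainder modulo 3). The for-each loop walks the snapshot taken when it started, so the contains() check is what skips elements that have since been removed. Keep in mind that CopyOnWriteArraySet copies its backing array on every mutation, which may not sit well with the memory constraints in the question.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

class SnapshotIterationSketch {
    /** Hypothetical stand-in: the other values in 0..8 sharing e's remainder modulo 3. */
    static Set<Integer> setOfElementsToRemove(int e) {
        Set<Integer> result = new HashSet<Integer>();
        for (int i = 0; i < 9; i++) {
            if (i != e && i % 3 == e % 3) {
                result.add(i);
            }
        }
        return result;
    }

    static Set<Integer> run() {
        Set<Integer> set = new CopyOnWriteArraySet<Integer>();
        for (int i = 0; i < 9; i++) {
            set.add(i);
        }
        for (Integer e : set) { // iterates over a snapshot, in insertion order
            if (set.contains(e)) { // ignore elements removed since the snapshot
                set.removeAll(setOfElementsToRemove(e));
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```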

There is a simple answer: use the Iterator.remove() method.

If you have enough memory for one copy of the set, I'll assume you also have enough memory for two copies. The Kafka-esque rules you cite don't seem to forbid that :)

My suggestion, then:

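Set<SomeClass> set = new HashSet<SomeClass>();
fillSet(set);
// Second copy of the same contents, used only to drive the loop.
Set<SomeClass> copy = new HashSet<SomeClass>(set);
for (SomeClass item : copy) {
    if (set.contains(item)) { // ignore items that were already removed
        set.removeAll(setOfElementsToRemove(item));
    }
}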

So copy stays intact and just provides the elements to loop over, while set suffers deletions. Elements that were removed from set in the meantime are ignored.

Why don't you use the iterator's remove method on the objects you want to remove?

Iterators were introduced mainly because enumerators couldn't handle deleting while enumerating.

You should call the Iterator.remove method.

Also note that on most java.util collections the remove method will throw a ConcurrentModificationException if the contents of the collection have been changed by anything other than the iterator. So if the code is multi-threaded, use extra caution or use concurrent collections.

It is possible to implement a Set that allows its elements to be removed whilst iterating over it.

I think the standard implementations (HashSet, TreeSet etc.) disallow it because that means they can use more efficient algorithms, but it's not hard to do.

Here's an incomplete example using Google Collections:

import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import com.google.common.base.Predicates;
import com.google.common.collect.ForwardingSet;
import com.google.common.collect.Iterators;
import com.google.common.collect.Sets;

public class ConcurrentlyModifiableSet<E>
extends ForwardingSet<E> {
 /** Create a new, empty set */
 public ConcurrentlyModifiableSet() {
  Map<E, Boolean> map = new ConcurrentHashMap<E, Boolean>();
  delegate = Sets.newSetFromMap(map);
 }

 @Override
 public Iterator<E> iterator() {
  return Iterators.filter(delegate.iterator(), Predicates.in(delegate));
 }

 @Override
 protected Set<E> delegate() {
  return this.delegate;
 }

 private Set<E> delegate;
}

Note: The iterator does not support the remove() operation (but the example in the question does not require it).

Copied from the Java API:

The List interface provides a special iterator, called a ListIterator, that allows element insertion and replacement, and bidirectional access in addition to the normal operations that the Iterator interface provides. A method is provided to obtain a list iterator that starts at a specified position in the list.

I thought I would point out that ListIterator, which is a special kind of Iterator, is built for replacement.
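For illustration, here is a small sketch of the three editing operations a ListIterator offers while traversing: add(), set() and remove(). The class name and the sample strings are made up. It applies to List implementations only, so it does not directly help with the Set in the question, and each call still affects a single position rather than a group of elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class ListIteratorSketch {
    /** Returns an edited copy: inserts "x" after "a", replaces "b" with "B", drops "c". */
    static List<String> edit(List<String> input) {
        List<String> list = new ArrayList<String>(input);
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            String s = it.next();
            if (s.equals("a")) {
                it.add("x");   // inserted after "a"; iteration continues past it
            } else if (s.equals("b")) {
                it.set("B");   // replaces the element just returned
            } else if (s.equals("c")) {
                it.remove();   // removes the element just returned
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(edit(Arrays.asList("a", "b", "c", "d")));
    }
}
```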
