How to remove all duplicated strings from a Java List?

For a given list, say [ "a", "a", "b", "c", "c" ], I need [ "b" ] (only the non-duplicated elements) as output. Note that this is different from using the Set interface for the job...

I wrote the following code to do this in Java:

void unique(List<String> list) {
    Collections.sort(list);
    List<String> dup = new ArrayList<>();
    int i = 0, j = 0;

    for (String e : list) {
        i = list.indexOf(e);
        j = list.lastIndexOf(e);

        if (i != j && !dup.contains(e)) {
            dup.add(e);
        }
    }

    list.removeAll(dup);
}

It works... but for a list of size 85320, it only finishes after several minutes!

You get the best performance with sets:

    String[] xs = { "a", "a", "b", "c", "c" };

    Set<String> singles = new TreeSet<>();
    Set<String> multiples = new TreeSet<>();

    for (String x : xs) {
        if(!multiples.contains(x)){
            if(singles.contains(x)){
                singles.remove(x);
                multiples.add(x);
            }else{
                singles.add(x);
            }
        }
    }

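It's a single pass, and insert, remove and contains are each O(log n) on a TreeSet, so the whole loop is O(n log n). When the loop ends, singles holds the elements that occur exactly once.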
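If sorted output is not needed, the same single-pass idea can be sketched with hash-based sets, where add, remove and contains run in expected constant time. The class name SinglesBySet and the method nonDuplicated are made up for this illustration, and using a LinkedHashSet to keep first-seen order is an assumption rather than part of the answer above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SinglesBySet {
    // Hypothetical helper: returns the elements that occur exactly once,
    // in the order in which they were first seen.
    static List<String> nonDuplicated(List<String> xs) {
        Set<String> singles = new LinkedHashSet<>(); // seen exactly once so far
        Set<String> multiples = new HashSet<>();     // seen two or more times
        for (String x : xs) {
            if (multiples.contains(x)) {
                continue; // already known to be duplicated
            }
            // Set.add returns false when x was already present, i.e. this
            // is the second occurrence, so x moves from singles to multiples.
            if (!singles.add(x)) {
                singles.remove(x);
                multiples.add(x);
            }
        }
        return new ArrayList<>(singles);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "a", "b", "c", "c");
        System.out.println(nonDuplicated(input)); // prints [b]
    }
}
```

Unlike the unique(List) method in the question, this sketch returns a new list and leaves the input untouched.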

Using Java 8 streams:

return list.stream()
    .collect(Collectors.groupingBy(e -> e, Collectors.counting()))
    .entrySet()
    .stream()
    .filter(e -> e.getValue() == 1)
    .map(Map.Entry::getKey)
    .collect(Collectors.toList());

You can use streams to achieve this in simpler steps, as shown below with inline comments:

//Find out unique elements first
List<String> unique = list.stream().distinct().collect(Collectors.toList());

//List to collect output list
List<String> output = new ArrayList<>();

//Iterate over each unique element
for(String element : unique) {

    //if element found only ONCE add to output list
    if(list.stream().filter(e -> e.equals(element)).count() == 1) {
        output.add(element);
    }
}

You can use a Map. Do the following:

1. Create a map of type Map<String, Integer>.
2. For each element:
       check if the string is already in the map
             if yes, increment the value of that map entry by 1
       else add <current element, 1>
3. Your output is then those entries of the Map whose values are 1.
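The steps above could be sketched in Java roughly as follows. The class name CountWithMap and the method occurringOnce are hypothetical, and choosing a LinkedHashMap (so the output keeps first-seen order) is an assumption; a plain HashMap works too if order does not matter:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CountWithMap {
    // Hypothetical helper: counts occurrences, then keeps the strings
    // whose count is exactly 1.
    static List<String> occurringOnce(List<String> list) {
        // Steps 1 and 2: build the map of string -> occurrence count.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String s : list) {
            // merge inserts 1 for a new key, otherwise adds 1 to the old value.
            counts.merge(s, 1, Integer::sum);
        }

        // Step 3: collect the keys whose value is 1.
        List<String> output = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == 1) {
                output.add(entry.getKey());
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "a", "b", "c", "c");
        System.out.println(occurringOnce(input)); // prints [b]
    }
}
```

With a hash-based map this takes two linear passes (one over the list, one over the map) and does not need the list to be sorted.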

Given that you can sort the list, about the most efficient way to do this is to use a ListIterator to iterate over runs of adjacent elements:

List<String> dup = new ArrayList<>();
Collections.sort(list);
ListIterator<String> it = list.listIterator();
while (it.hasNext()) {
  String first = it.next();

  // Count the number of elements equal to first.
  int cnt = 1;
  while (it.hasNext()) {
    String next = it.next();
    if (!first.equals(next)) {
        it.previous();
        break;
    }
    ++cnt;
  }

  // If the run contains more than one element, first is
  // duplicated. Otherwise, it's a singleton, so add it
  // to the output (note that dup holds the non-duplicated elements).
  if (cnt == 1) {
    dup.add(first);
  }
}

return dup;

ListIterator is more efficient than index-based access for lists which don't support random access, like LinkedList.

