Remove and mark duplicates in a String array

Question

I have this:

String array[] = {"test","testing again", "test"};

I want to mark and remove the duplicates. This is the output I need:

2x test

testing again

Can anyone help me? I have already tried using a Set, but it does not seem to recognize when a string is already in it.

Here is my code:

Set addons = new HashSet<String>();
final String[] arr ={"test","testing again", "test"};
            for (int i = 0; i < arr.length; i++) {
                Log.d(TAG, "contains adding " + arr[i]);

                if (addons.contains(arr[i])) {
                    //never enters here
                    Log.d(TAG, "contains " + arr[i]);
                    addons.remove(arr[i]);
                    addons.add("2 x " + arr[i]);
                } else {
                    addons.add("1 x " + arr[i]);
                }
            }

Answer 1

You can do it like this:

String[] arr = { "test", "testing again", "test" };
HashMap<String, Integer> counter = new HashMap<>();
for (int i = 0; i < arr.length; i++) {
    if (counter.containsKey(arr[i])) {
        counter.put(arr[i], counter.get(arr[i]) + 1);
    } else {
        counter.put(arr[i], 1);
    }
}
System.out.println("Occurrences:\n");
for (String key : counter.keySet()) {
    System.out.println(key + " x" + counter.get(key));
}

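Your approach does not work because, when you find a new occurrence of a word, you remove it and replace it with something like 2x [word]. When the word appears again, contains(...) returns false, because the bare word is not in the set (the set only ever holds prefixed strings such as 1 x [word]).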
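
To see why the contains branch in the question is never reached, it can help to print what the set actually holds. The following is only a minimal sketch: the class name SetContainsDemo and the helper buildAddons are made up for illustration, and the Android Log.d calls are dropped so that it runs on a plain JVM.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class SetContainsDemo {

    // Repeats the logic from the question: new words are stored with a "1 x " prefix.
    static Set<String> buildAddons(String[] words) {
        Set<String> addons = new LinkedHashSet<>();
        for (String word : words) {
            if (addons.contains(word)) {
                // Not reached for the question's input: the bare word is never stored.
                addons.remove(word);
                addons.add("2 x " + word);
            } else {
                addons.add("1 x " + word);
            }
        }
        return addons;
    }

    public static void main(String[] args) {
        Set<String> addons = buildAddons(new String[] {"test", "testing again", "test"});
        System.out.println(addons);                      // [1 x test, 1 x testing again]
        System.out.println(addons.contains("test"));     // false
        System.out.println(addons.contains("1 x test")); // true
    }
}
```

The second "test" is simply added again as "1 x test", which the set silently ignores, so no count is ever increased.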

Answer 2

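In Java 8:

import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import java.util.function.Function;
import java.util.stream.Stream;

Stream.of("test", "testing again", "test")
        .collect(groupingBy(Function.identity(), counting()))
        .forEach((str, freq) -> {
            System.out.printf("%20s: %d%n", str, freq);
        });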
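
The two-argument groupingBy makes no guarantee about the type of map it returns, so the order of the printed lines is unspecified. If first-seen order matters, one option is the three-argument overload with a LinkedHashMap supplier. This is a sketch under that assumption; StreamCounter and countInOrder are illustrative names.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamCounter {

    // Counts occurrences while keeping the order in which words first appear.
    static Map<String, Long> countInOrder(String... words) {
        return Stream.of(words)
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        countInOrder("test", "testing again", "test")
                .forEach((word, freq) -> System.out.println(freq + "x " + word));
        // prints "2x test" then "1x testing again"
    }
}
```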

Answer 3

Try this:

 public static void main(String[] args) {

        Set<String> addons = new HashSet<>();
        final String[] arr = { "test", "testing again", "test","test","testing again" };
        int count = 0;
        for (int i = 0; i < arr.length; i++) {

            for (int j = 0; j < arr.length; j++) {
                if (arr[i].equals(arr[j])) {
                    count++;
                }
            }

            addons.add(count + " x " + arr[i]);
            count = 0;
        }

        System.out.println(addons);

    }

Output:

[2 x testing again, 3 x test]
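
The nested loop above compares every element with every other element. The same idea can be written with Collections.frequency, which performs the inner count, while iterating over a LinkedHashSet of the distinct words avoids recounting repeats and keeps first-seen order. A sketch only; FrequencyLabels and label are illustrative names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

class FrequencyLabels {

    // Builds one "N x word" label per distinct word, in first-seen order.
    static List<String> label(String[] words) {
        List<String> all = Arrays.asList(words);
        List<String> labels = new ArrayList<>();
        for (String word : new LinkedHashSet<>(all)) {
            labels.add(Collections.frequency(all, word) + " x " + word);
        }
        return labels;
    }

    public static void main(String[] args) {
        String[] arr = {"test", "testing again", "test", "test", "testing again"};
        System.out.println(label(arr)); // [3 x test, 2 x testing again]
    }
}
```

Each call to Collections.frequency still scans the whole list, so for large arrays a single-pass Map counter is likely the better fit.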

Answer 4

String[] arr ={"test","testing again", "test"};
Map<String, Integer> results = new HashMap<>();
for (int i = 0; i < arr.length; i++) {
  Log.d(TAG, "contains adding " + arr[i]);
  if (results.containsKey(arr[i])) {
      Log.d(TAG, "contains " + arr[i]);
      results.put(arr[i], results.get(arr[i]) + 1);
  } else {
      results.put(arr[i], 1);
  }
}

Answer 5

Try the following code.

String[] array ={"test","testing again","test"};
Set<String> uniqueWords = new HashSet<String>(Arrays.asList(array));

Answer 6

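The problem is that you do not add "test" to the set directly; you add "1 x test" instead.

So it is better to use a Map that stores each string together with its number of occurrences.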

    String[] array = { "test", "testing again", "test" };
    Map<String, Integer> addons = new HashMap<>();

    for (String s : array) {
        System.out.println("Dealing with [" + s + "]");
        if (addons.containsKey(s)) {
            System.out.println("\tAlready contains [" + s  + "]");
            addons.put(s, addons.get(s) + 1); // increment count of s
        } else {
            System.out.println("\tFirst time seeing [" + s  + "]");
            addons.put(s, 1); // first time we encounter s
        }
    }
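
On Java 8 or later, the counting loop can be shortened with Map.merge, and the counts can then be turned into the format requested in the question, where the count prefix appears only for repeated words. This is a sketch, not a definitive implementation: DuplicateMarker, count, and format are illustrative names, and the "2x test" formatting is read off the question's sample output.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateMarker {

    // Counts each word; LinkedHashMap keeps the order of first appearance.
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    // Prefixes only repeated words with their count, e.g. "2x test".
    static List<String> format(Map<String, Integer> counts) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            int n = entry.getValue();
            lines.add(n > 1 ? n + "x " + entry.getKey() : entry.getKey());
        }
        return lines;
    }

    public static void main(String[] args) {
        String[] array = {"test", "testing again", "test"};
        format(count(array)).forEach(System.out::println);
        // prints "2x test" then "testing again"
    }
}
```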

Answer 7

Use this, where the Map key is the String element and the value is that element's count.

public static void main(String[] args) {
        String array[] = {"test","testing again", "test"};

        Map<String, Integer> myMap = new HashMap<>();

        for (int i = 0; i < array.length; i++) {
            if (myMap.containsKey(array[i])) {
                Integer count = myMap.get(array[i]);
                myMap.put(array[i], ++count);
            } else{
                myMap.put(array[i], 1);
            }
        }

        System.out.println(myMap);
    }

Answer 8

String array[] = {"test","testing again", "test"};
Map<String, Integer> countMap = new HashMap<>();
for (int i = 0; i<array.length; i++) {
    Integer count = countMap.get(array[i]);
    if(count == null) {
        count = 0;
    }
    countMap.put(array[i], (count.intValue()+1));
}
System.out.println(countMap.toString());

Output:

{test=2, testing again=1}

Answer 9

You can use Multiset from Guava.

String array[] = {"test","testing again", "test"};
Multiset<String> set = HashMultiset.create(Arrays.asList(array));
System.out.println(set);

Output:

[test x 2, testing again]

Basically, a Multiset counts how many times you have tried to add an object.

for (Multiset.Entry<String> entry : set.entrySet()) {
    System.out.println(entry.getCount() + "x " + entry.getElement());
}

Output:

2x test 
1x testing again

Answer 10

You can use your own class that keeps track of the duplicates:

class SetWithDuplicates extends HashSet<String> {

    // Elements that were added more than once.
    private final Set<String> duplicates = new HashSet<>();

    @Override
    public boolean add(String e) {
        boolean added = super.add(e);
        if (!added) {
            // super.add returned false, so e was already in the set.
            duplicates.add(e);
        }
        return added;
    }

    public Set<String> duplicates() {
        return duplicates;
    }

}

Then use it much like @Ganpat Kaliya does:

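String[] array = {"test", "testing again", "test"};
// The class declares no type parameter and no Collection constructor,
// so create it empty and fill it with addAll.
SetWithDuplicates uniqueWords = new SetWithDuplicates();
uniqueWords.addAll(Arrays.asList(array));
Set<String> duplicates = uniqueWords.duplicates();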
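
For reference, here is a self-contained sketch of this approach that can be run as is. The wrapper class DuplicateSetDemo and the helper fill are made up for illustration; the nested class repeats the one from this answer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class DuplicateSetDemo {

    // Same idea as the class above, nested here so the example is self-contained.
    static class SetWithDuplicates extends HashSet<String> {
        private final Set<String> duplicates = new HashSet<>();

        @Override
        public boolean add(String e) {
            boolean added = super.add(e);
            if (!added) {
                duplicates.add(e); // already present, so remember it as a duplicate
            }
            return added;
        }

        Set<String> duplicates() {
            return duplicates;
        }
    }

    // Creates the set empty, then adds the words one by one via addAll.
    static SetWithDuplicates fill(String[] words) {
        SetWithDuplicates set = new SetWithDuplicates();
        set.addAll(Arrays.asList(words)); // addAll calls the overridden add per element
        return set;
    }

    public static void main(String[] args) {
        SetWithDuplicates words = fill(new String[] {"test", "testing again", "test"});
        System.out.println(words.size());       // 2
        System.out.println(words.duplicates()); // [test]
    }
}
```

The set is filled after construction on purpose. Delegating to the HashSet(Collection) constructor would call the overridden add before the duplicates field is initialized, which would fail with a NullPointerException on the first repeated element.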

10 answers

Answer 1
2 accepted 2015-09-30 12:24:24

Answer 2
2 2015-09-30 12:38:44

Answer 3
1 2015-09-30 12:42:12

Answer 4
0 2015-09-30 12:26:06

Answer 5
0 2015-09-30 12:28:03

Answer 6
0 2015-09-30 12:28:41

Answer 7
0 2015-09-30 12:30:40

Answer 8
0 2015-09-30 12:33:25

Answer 9
0 2015-09-30 12:50:05

Answer 10
0 2015-09-30 12:59:39
