How do I count the words in a string while ignoring repeated words? (using a method)

Question

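The code here only shows how many words there are. How do I ignore repeated words? For example, "A long long time ago, I can still remember" should return 8 instead of 9.

I want it to be a method that takes one parameter of type String and returns an int value. I am also only allowed to use the basics, so no hash keys, hash sets, or other advanced stuff.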

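public static int mostCommonLetter(String s) {
    int wordCount = 0;
    boolean word = false;            // true while we are inside a word
    int endOfLine = s.length() - 1;  // index of the last character

    for (int i = 0; i < s.length(); i++) {
        if (Character.isLetter(s.charAt(i)) && i != endOfLine) {
            // a letter that is not the last character: we are inside a word
            word = true;
        } else if (!Character.isLetter(s.charAt(i)) && word) {
            // a non-letter right after a word: the word has ended
            wordCount++;
            word = false;
        } else if (Character.isLetter(s.charAt(i)) && i == endOfLine) {
            // a letter at the very end of the string closes the last word
            wordCount++;
        }
    }
    return wordCount;
}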

How do I ignore repeated words?

Answer 1

import java.util.*;

public class MyClass {
    public static void main(String[] args) {
        String input = "A long long time ago, I can still remember";
        String[] words = input.split(" ");
        // Keep each word only the first time it is seen
        List<String> uniqueWords = new ArrayList<>();
        for (String word : words) {
            if (!uniqueWords.contains(word)) {
                uniqueWords.add(word);
            }
        }
        System.out.println("Number of unique words: " + uniqueWords.size());
    }
}

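Output: Number of unique words: 8

Basically, if you are allowed to use data structures such as lists, you can create a list and add each word of the input sentence to it if and only if the word is not already in the list.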
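Since the question asks for a method that takes a String and returns an int, the list-based idea above could be wrapped roughly as follows. This is only a sketch: the class name UniqueWordsList and the method name countUniqueWords are made up for illustration, and it assumes that ArrayList counts as "basic" enough. Skipping empty tokens is an extra assumption so that repeated spaces are not counted as a word.

```java
import java.util.ArrayList;
import java.util.List;

class UniqueWordsList {
    // Hypothetical helper: counts distinct space-separated words in s.
    static int countUniqueWords(String s) {
        List<String> seen = new ArrayList<>();
        for (String word : s.split(" ")) {
            // Skip empty tokens (produced by consecutive spaces) and
            // add a word only the first time it appears.
            if (!word.isEmpty() && !seen.contains(word)) {
                seen.add(word);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // "long" appears twice, so 9 words give 8 unique ones.
        System.out.println(countUniqueWords("A long long time ago, I can still remember"));
    }
}
```

Note that the comparison is case-sensitive and that punctuation stays attached to the word, so "ago," and "ago" would count as two different words.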

Answer 2

The general idea:

import java.util.HashSet;
import java.util.Set;

public int getUniqueWords(String input) {
    // Split the string into words using the split() method
    String[] words = input.split(" ");

    // Create a Set to store the unique words
    Set<String> uniqueWords = new HashSet<String>();

    // Loop through the words and add them to the Set
    for (String word : words) {
        uniqueWords.add(word);
    }

    // Return unique words amount
    return uniqueWords.size();
}

The same solution using the Stream API:

import java.util.Arrays;

public int getUniqueWords2(String input) {
    // here we can safely cast to int, because String can contain at most "max int" chars
    return (int) Arrays.stream(input.split(" ")).distinct().count();
}

If you need to handle multiple spaces between words, add some cleanup for input:

// remove leading and trailing spaces
String cleanInput = input.trim();

// replace runs of whitespace with a single space
cleanInput = cleanInput.replaceAll("\\s+", " ");
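Put together, the cleanup and the stream-based count might look like the sketch below. The class name UniqueWordsClean and the method name countUniqueWords are illustrative only, and the explicit empty-string check is an added assumption: splitting an empty string still yields one (empty) token, which would otherwise be counted as a word.

```java
import java.util.Arrays;

class UniqueWordsClean {
    // Hypothetical helper: cleans the input, then counts distinct words.
    static int countUniqueWords(String input) {
        // Remove leading/trailing whitespace and collapse whitespace runs.
        String cleanInput = input.trim().replaceAll("\\s+", " ");

        // "".split(" ") returns a one-element array containing "",
        // so an input with no words is handled explicitly.
        if (cleanInput.isEmpty()) {
            return 0;
        }

        // The cast is safe: a String cannot hold more than Integer.MAX_VALUE chars.
        return (int) Arrays.stream(cleanInput.split(" ")).distinct().count();
    }

    public static void main(String[] args) {
        System.out.println(countUniqueWords("  A long   long time ago,  I can still remember "));
    }
}
```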

Regarding the "only allowed to use the basics" requirement:

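A hash table (HashSet) is one of the basic data structures in algorithms.
Counting unique items cannot logically be solved without a container of "already seen" items, so that the algorithm can check whether the next item has already been counted.
In the simplest case a list can play the role of that container, but this leads to O(n^2) time complexity.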
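To make the last point concrete, here is a sketch of the "simplest container" variant that uses nothing but a plain array and loops, which may fit the question's restriction. The class name UniqueWordsArray and the method name countUniqueWords are hypothetical. The inner loop scans every word seen so far, which is where the O(n^2) behaviour comes from.

```java
class UniqueWordsArray {
    // Hypothetical helper: counts distinct space-separated words
    // using only arrays and loops (no collections).
    static int countUniqueWords(String s) {
        String[] words = s.split(" ");

        // "Already seen" container: there can be at most words.length distinct words.
        String[] seen = new String[words.length];
        int seenCount = 0;

        for (String word : words) {
            if (word.isEmpty()) {
                continue; // skip empty tokens caused by repeated spaces
            }
            boolean alreadySeen = false;
            // Linear scan over the words seen so far: O(n) per word, O(n^2) overall.
            for (int i = 0; i < seenCount; i++) {
                if (seen[i].equals(word)) {
                    alreadySeen = true;
                    break;
                }
            }
            if (!alreadySeen) {
                seen[seenCount] = word;
                seenCount++;
            }
        }
        return seenCount;
    }

    public static void main(String[] args) {
        System.out.println(countUniqueWords("A long long time ago, I can still remember"));
    }
}
```

For short sentences the quadratic cost is negligible; a HashSet only starts to matter for large inputs.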

Answer 3

You can use the Set<T> collection type, which can only contain unique values:

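import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public static int getTotalUniqueWords(String input) {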
    String[] words = input.split(" ");
    Set<String> uniqueWords = new HashSet<>();
    Collections.addAll(uniqueWords, words);
    return uniqueWords.size();
}

Or using streams:

public static long getTotalUniqueWordsStream(String input) {
    String[] words = input.split(" ");
    return Arrays.stream(words).distinct().count();
}

3 solutions

Solution 1
1 2022-12-03 21:41:18

Solution 2
1 2022-12-03 21:46:14

Solution 3
0 2022-12-04 01:36:12
