
How do I count the words in a string while ignoring repeated words? (using a method)

The code below only counts how many words there are. How do I ignore words that are the same? For example, "A long long time ago, I can still remember" should return 8 instead of 9.

I want it to be a method that takes one parameter s of type String and returns an int value. I'm only allowed to use the basics, so no hash keys, hash sets, or other advanced features.

public static int mostCommonLetter(String s) {
    int wordCount = 0;
    boolean word = false;            // true while we are inside a word
    int endOfLine = s.length() - 1;

    for (int i = 0; i < s.length(); i++) {
        if (Character.isLetter(s.charAt(i)) && i != endOfLine) {
            // a letter before the last position: we are inside a word
            word = true;
        } else if (!Character.isLetter(s.charAt(i)) && word) {
            // first non-letter after a word: the word has ended
            wordCount++;
            word = false;
        } else if (Character.isLetter(s.charAt(i)) && i == endOfLine) {
            // a letter at the last position closes the final word
            wordCount++;
        }
    }
    return wordCount;
}
        

How do I ignore the words that are the same?

import java.util.*;

public class MyClass {
    public static void main(String[] args) {
        String input = "A long long time ago, I can still remember";
        String[] words = input.split(" ");
        List<String> uniqueWords = new ArrayList<>();
        for (String word : words) {
            if (!uniqueWords.contains(word)) {
                uniqueWords.add(word);
            }
        }
        System.out.println("Number of unique words: " + uniqueWords.size());
    }
}

Output: Number of unique words: 8

Basically, if you're allowed to use data structures such as lists, you can create a list and add each word of the input sentence to it if and only if the word isn't already there. The size of the list is then the number of unique words.
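
If even a list counts as too advanced for the assignment, the same idea can be sketched with a plain array as the container. This is only a sketch under a few assumptions: the names UniqueWordsBasic and countUniqueWords are made up for illustration, and anything separated by a space is treated as a word, so "ago," keeps its comma just as in the code above.

```java
class UniqueWordsBasic {
    // Hypothetical method name; takes a String and returns an int, as the question asks.
    // Uses only an array and nested loops, no collections.
    public static int countUniqueWords(String s) {
        String[] words = s.split(" ");
        String[] seen = new String[words.length]; // room for one slot per word
        int uniqueCount = 0;

        for (String word : words) {
            if (word.isEmpty()) {
                continue; // split(" ") produces empty strings for repeated or leading spaces
            }
            // Look for the word among the ones stored so far
            boolean alreadySeen = false;
            for (int i = 0; i < uniqueCount; i++) {
                if (seen[i].equals(word)) {
                    alreadySeen = true;
                    break;
                }
            }
            if (!alreadySeen) {
                seen[uniqueCount] = word;
                uniqueCount++;
            }
        }
        return uniqueCount;
    }

    public static void main(String[] args) {
        // prints 8
        System.out.println(countUniqueWords("A long long time ago, I can still remember"));
    }
}
```

The inner loop rescans the stored words for every input word, so this is quadratic in the number of words, which should be fine for a sentence-sized input.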

General idea:

public int getUniqueWords(String input) {
    // Split the string into words using the split() method
    String[] words = input.split(" ");

    // Create a Set to store the unique words
    Set<String> uniqueWords = new HashSet<String>();

    // Loop through the words and add them to the Set
    for (String word : words) {
        uniqueWords.add(word);
    }

    // Return the number of unique words
    return uniqueWords.size();
}

The same solution using the Stream API:

public int getUniqueWords2(String input) {
    // the cast to int is safe: a String holds at most Integer.MAX_VALUE chars, so the count cannot exceed that
    return (int) Arrays.stream(input.split(" ")).distinct().count();
}

If you need to handle multiple spaces between words, clean up the input first:

// remove leading and trailing spaces
String cleanInput = input.trim();

// replace each run of whitespace with a single space
cleanInput = cleanInput.replaceAll("\\s+", " ");
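
As a rough sketch, the cleanup could be combined with the Set-based method like this. The names CleanInputDemo and getUniqueWordsClean are hypothetical, and the empty-input check is an addition: "".split(" ") returns an array with one empty string, which would otherwise be counted as a word.

```java
import java.util.HashSet;
import java.util.Set;

class CleanInputDemo {
    // Hypothetical name: the Set-based counting with the cleanup applied first
    static int getUniqueWordsClean(String input) {
        // Remove leading and trailing whitespace, then collapse whitespace runs to one space
        String cleanInput = input.trim().replaceAll("\\s+", " ");

        // Splitting an empty string yields one empty element, so handle blank input here
        if (cleanInput.isEmpty()) {
            return 0;
        }

        Set<String> uniqueWords = new HashSet<>();
        for (String word : cleanInput.split(" ")) {
            uniqueWords.add(word);
        }
        return uniqueWords.size();
    }

    public static void main(String[] args) {
        // prints 3
        System.out.println(getUniqueWordsClean("  A  long   long time  "));
    }
}
```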

Regarding the requirement "allowed to use the basics":

  1. A hash table (HashSet) is a basic data structure in algorithms.
  2. Counting unique items cannot logically be done without a container that holds the "already seen" items, so that the algorithm can check whether the next item has already been counted.
  3. In the simplest case a list could serve as that container, but that results in O(n^2) time complexity.
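
If hash-based collections are ruled out and the quadratic cost of a list is a concern, one alternative that is not part of the answers here is to sort the word array and count the places where neighbouring words differ. The sorted array then plays the role of the container, and the sort dominates at O(n log n). This is a sketch only: SortedUniqueCount and countUniqueBySorting are made-up names, and whether Arrays.sort counts as "the basics" is an assumption.

```java
import java.util.Arrays;

class SortedUniqueCount {
    // Hypothetical method name; counts distinct whitespace-separated words by sorting
    static int countUniqueBySorting(String input) {
        String trimmed = input.trim();
        if (trimmed.isEmpty()) {
            return 0; // no words at all
        }

        String[] words = trimmed.split("\\s+");
        Arrays.sort(words); // equal words end up next to each other

        int uniqueCount = 1; // the first word is always new
        for (int i = 1; i < words.length; i++) {
            if (!words[i].equals(words[i - 1])) {
                uniqueCount++; // a change between neighbours marks a new word
            }
        }
        return uniqueCount;
    }

    public static void main(String[] args) {
        // prints 8
        System.out.println(countUniqueBySorting("A long long time ago, I can still remember"));
    }
}
```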

You can use a Set<T> collection, which can only contain unique values:

public static int getTotalUniqueWords(String input) {
    String[] words = input.split(" ");
    Set<String> uniqueWords = new HashSet<>();
    Collections.addAll(uniqueWords, words);
    return uniqueWords.size();
}

Or with streams:

public static long getTotalUniqueWordsStream(String input) {
    String[] words = input.split(" ");
    return Arrays.stream(words).distinct().count();
}
