How to count how many times a keyword appears in a text in Java?

Question 1:

I am trying to count the frequency of a keyword. My code works, except that it also counts words that merely contain the keyword (for example, if I search for "count", words like "account" are counted too). Does anyone know how to solve this?

Question 2:

I also want to count the number of unique words in a text (that is, a repeated word is counted only once). I don't know how to achieve this either; my code only gives me the total number of words.

Here is my code:

import java.util.Scanner;


public class Text_minining {

/**
 * @param args
 */
public static void main(String[] args) {

    //Prompt the user for the search word
    System.out.print("enter a search word: ");
    //Get the user's search word input
    Scanner keywordScanner = new Scanner(System.in);
    String keyword = keywordScanner.nextLine();
    keyword = keyword.toLowerCase();

    //Prompt the user for the text
    System.out.println("Enter a string of words (words separated by single spaces or tabs): ");
    //Get the user's string input
    Scanner userInputScanner = new Scanner(System.in);
    String userInput = userInputScanner.nextLine();
    userInput = userInput.toLowerCase();

    int keywordCount = 0, wordCount = 0;
    int lastIndex = 0;

    while(lastIndex != -1){
        lastIndex = userInput.indexOf(keyword,lastIndex);
        if(lastIndex != -1){
            keywordCount ++;
            lastIndex = keyword.length() + lastIndex;
        }
    }

    boolean wasSpace=true;
    for (int i = 0; i < userInput.length(); i++) 
    {
        if (userInput.charAt(i) == ' ') {
            wasSpace=true;
        }
        else{
            if(wasSpace == true) wordCount++;
            wasSpace = false;
            }
    }

    //Print the results to the screen
    System.out.println("-------");
    System.out.println("Good, \"" + keyword + "\"appears in the text and the word count is " + keywordCount);
    System.out.println("The total number of unique words in the text is " + wordCount);


    System.exit(0);
}
    }

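First: userInput.split(keyword).length - 1 gives a rough count, but it still matches substrings (so "account" counts for "count"), and split drops trailing empty strings, so a match at the very end of the text is missed. Or use a regex with word boundaries (\b).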
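For the regex route, here is a minimal sketch. The class and method names (WholeWordCounter, countWholeWord) are made up for illustration, and it assumes the keyword starts and ends with a letter, digit, or underscore, since \b only marks a boundary next to such characters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question's code.
class WholeWordCounter {

    // Counts occurrences of keyword as a whole word, ignoring case.
    static int countWholeWord(String text, String keyword) {
        // Pattern.quote makes the keyword literal; \b requires a word boundary on each side.
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(keyword) + "\\b",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "account" is skipped because the match would start in the middle of a word.
        System.out.println(countWholeWord("Count my account, then count again", "count"));
    }
}
```

Unlike splitting on whitespace, this also copes with punctuation attached to a word ("count," or "count.").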

Second:

Set<String> uniqueWords = new HashSet<String>();
for (String word : userInput.split(" ")) {
   uniqueWords.add(word);
}
System.out.println("Unique words count " + uniqueWords.size());
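One caveat: the question allows tabs as separators, and split(" ") neither splits on tabs nor collapses runs of spaces. A hedged variant that splits on any whitespace run might look like this (UniqueWordCounter and countUniqueWords are illustrative names, and lowercasing is an assumption carried over from the question's code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the original answer.
class UniqueWordCounter {

    // Returns the number of distinct words, treating any run of whitespace as one separator.
    static int countUniqueWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return 0; // splitting an empty string would yield one empty token
        }
        Set<String> unique = new HashSet<>(
                Arrays.asList(trimmed.toLowerCase().split("\\s+")));
        return unique.size();
    }

    public static void main(String[] args) {
        System.out.println("Unique words count "
                + countUniqueWords("the cat\tsaw  the Cat"));
    }
}
```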

Just use string method split.

String[] words = userInput.split("\\s+");

and then check and count the keyword...

int keywordCount = 0;
for (String w : words) {
    if (w.equals(keyword)) {
        keywordCount++;
    }
}

Agree. Use split to create the array and then you can use

new HashSet<>(Arrays.asList(yourArray)).size();

to find the count of unique words.

I would suggest this approach:

  • Split the userInput string on whitespace: userInput.split("\\s+") . You will get an array. See String.split()
  • For question 1: iterate over the array, comparing each string with your keyword . See String.equals() and String.equalsIgnoreCase() .
  • For question 2: add the array's elements to a Set . Since a Set can't contain duplicate items, its size will give you the answer.
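The steps above can be sketched in a single pass over the words. TextStats and analyze are hypothetical names chosen for this example, and words are lowercased before going into the set so that "Cat" and "cat" count as one word (an assumption that matches the toLowerCase() calls in the question).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical container for both results; not part of the original answer.
class TextStats {
    final int keywordCount;
    final int uniqueWords;

    TextStats(int keywordCount, int uniqueWords) {
        this.keywordCount = keywordCount;
        this.uniqueWords = uniqueWords;
    }

    // Splits on whitespace, counts exact (case-insensitive) keyword matches,
    // and collects every word into a Set to count the distinct ones.
    static TextStats analyze(String text, String keyword) {
        int keywordCount = 0;
        Set<String> unique = new HashSet<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens when the input is blank
            }
            if (word.equalsIgnoreCase(keyword)) {
                keywordCount++;
            }
            unique.add(word.toLowerCase());
        }
        return new TextStats(keywordCount, unique.size());
    }

    public static void main(String[] args) {
        TextStats stats = analyze("count the account count", "count");
        System.out.println("keyword count: " + stats.keywordCount);
        System.out.println("unique words: " + stats.uniqueWords);
    }
}
```

Because each token is compared as a whole, "account" no longer counts as a match for "count". Punctuation stays attached to tokens, so "count," would not match; strip it first if the text is not plain words.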
