
How to count the number of times a word shows up in a text file in Java?

So I'm pretty new to Java and I'm working on a program that is supposed to read a .txt file that the user names and then ask the user for a word to search for within that file. I'm having trouble figuring out how to count the number of times the entered word shows up in the .txt file. Instead, the code I have is only counting lines, not occurrences of the word. Can anyone help me figure out what to do to have my program count the number of times the word shows up instead of the number of lines? Thank you! Here's the code:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class TextSearch {

    public static void main(String[] args) throws FileNotFoundException {
        Scanner txt;
        File file = null;
        String Default = "/eng/home/tylorkun/workspace/09.1/src/Sample.txt";

        try {
            txt = new Scanner(System.in);
            System.out.print("Please enter the text file name or type  'Default' for a default file. ");
            file = new File(txt.nextLine());

            txt = new Scanner(file);

            while (txt.hasNextLine()) {
                String line = txt.nextLine();
                System.out.println(line);
            }
            txt.close();
        } catch (Exception ex) {
            ex.printStackTrace();
        }

        try {
            txt = new Scanner(file);
            Scanner in = new Scanner(System.in);
            in.nextLine();
            System.out.print("Please enter a string to search for. Please do not enter a string longer than 16 characters. ");
            String wordInput = in.nextLine();

            //If too long
            if (wordInput.length() > 16) {
                System.out.println("Please do not enter a string longer than 16 characters. Try again. ");
                wordInput = in.nextLine();
            }

            //Search
            int count = 0;
            while (txt.hasNextLine()) //Should txt be in? 
            {
                String line = txt.nextLine();
                count++;
                if (line.contains(wordInput)) //count > 0
                {
                    System.out.println("'" + wordInput + "' was found " + count + " times in this document. ");
                    break;
                }
            //else
                //{
                //    System.out.println("Word was not found. ");
                //}
            }
        } catch (FileNotFoundException e) {
            System.out.println("Word was not found. ");
        }
    } //main ends
} //TextSearch ends

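Your problem is that you are incrementing count on each line, regardless of whether the word is present, and you break out of the loop at the first line that contains it, so the number you print is really the line number of the first match. Also, you have no code to count multiple matches per line.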

Instead, use a regex search to find the matches, and increment count for each match found:

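//Search (needs: import java.util.regex.Matcher; import java.util.regex.Pattern;)
int count = 0;
// LITERAL treats wordInput as plain text, not as a regex; CASE_INSENSITIVE ignores case
Pattern pattern = Pattern.compile(wordInput, Pattern.LITERAL | Pattern.CASE_INSENSITIVE);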
while(txt.hasNextLine()){
    Matcher m = pattern.matcher(txt.nextLine());

    // Loop through all matches
    while (m.find()) {
        count++;
    }
}
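
To try the regex loop above on its own, here is a minimal self-contained sketch. The class name RegexWordCount, the helper countOccurrences, and the sample lines are made up for illustration and are not part of the asker's program; the lines are passed in as a list so no file is needed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexWordCount {

    // Hypothetical helper: counts every non-overlapping, case-insensitive
    // occurrence of word across all of the given lines.
    static int countOccurrences(List<String> lines, String word) {
        // LITERAL: characters such as '.' or '*' in word are matched as-is
        Pattern pattern = Pattern.compile(word, Pattern.LITERAL | Pattern.CASE_INSENSITIVE);
        int count = 0;
        for (String line : lines) {
            Matcher m = pattern.matcher(line);
            // find() advances past each match, so one line can add several hits
            while (m.find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample data (assumption): stands in for the lines read from the file
        List<String> lines = Arrays.asList("The cat saw the dog.", "Then the dog left.");
        System.out.println(countOccurrences(lines, "the")); // prints 4
    }
}
```

Note that "Then" counts as a hit because this matches substrings, not standalone words, and that "The" counts because of CASE_INSENSITIVE.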

NOTE: Not sure what you are using this for, but if you just need the functionality you can combine the grep and wc (wordcount) command-line utilities. See this SO Answer for how to do that.

Since the word doesn't have to be standalone, you can use a for loop over indexOf to count how many times your word appears in each line.

public static void main(String[] args) throws Exception {
    String wordToSearch = "the";
    String data = "the their father them therefore then";
    int count = 0;
    for (int index = data.indexOf(wordToSearch); 
             index != -1; 
             index = data.indexOf(wordToSearch, index + 1)) {
        count++;
    }

    System.out.println(count);
}

Results:

6

So the searching segment of your code could look like:

//Search
int count = 0;
while (txt.hasNextLine()) 
{
    String line = txt.nextLine();
    for (int index = line.indexOf(wordInput); 
             index != -1; 
             index = line.indexOf(wordInput, index + 1)) {
        count++;
    }        
}

System.out.println(count);
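
For completeness, here is a runnable sketch of the same indexOf loop wrapped in a method. The class name IndexOfWordCount and the helper countOccurrences are hypothetical, and the Scanner reads from an in-memory string instead of the asker's file so that the example runs anywhere. The empty-string guard is an addition: indexOf with an empty search string never returns -1, so without the guard the loop would not terminate.

```java
import java.util.Scanner;

public class IndexOfWordCount {

    // Hypothetical helper: counts case-sensitive occurrences of wordInput
    // in everything the Scanner still has to read.
    static int countOccurrences(Scanner txt, String wordInput) {
        // Guard (assumption): treat an empty search string as "no matches",
        // because indexOf("") always succeeds and the loop would never end.
        if (wordInput.isEmpty()) {
            return 0;
        }
        int count = 0;
        while (txt.hasNextLine()) {
            String line = txt.nextLine();
            // Restart the search one character after each hit
            for (int index = line.indexOf(wordInput);
                     index != -1;
                     index = line.indexOf(wordInput, index + 1)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // In the real program this would be new Scanner(file)
        try (Scanner txt = new Scanner("the their father\nthem therefore then")) {
            System.out.println(countOccurrences(txt, "the")); // prints 6
        }
    }
}
```

Because the search resumes at index + 1, overlapping occurrences are counted: searching "aaa" for "aa" gives 2, whereas a Matcher.find() loop would give 1. To count only non-overlapping matches, advance by wordInput.length() instead.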
