
Count characters, words and lines in a file

This is supposed to count the number of lines, words and characters in a file.

But it doesn't work. The output shows only 0.

Code:

public static void main(String[] args) throws IOException {
    int ch;
    boolean prev = true;        
    //counters
    int charsCount = 0;
    int wordsCount = 0;
    int linesCount = 0;

    Scanner in = null;
    File selectedFile = null;
    JFileChooser chooser = new JFileChooser();
    // choose file 
    if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
        selectedFile = chooser.getSelectedFile();
        in = new Scanner(selectedFile);         
    }

    // count the characters of the file till the end
    while(in.hasNext()) {
        ch = in.next().charAt(0);
        if (ch != ' ') ++charsCount;
        if (!prev && ch == ' ') ++wordsCount;
        // don't count if previous char is space
        if (ch == ' ') 
            prev = true;
        else 
            prev = false;

        if (ch == '\n') ++linesCount;
    }

    //display the count of characters, words, and lines
    charsCount -= linesCount * 2;
    wordsCount += linesCount;
    System.out.println("# of chars: " + charsCount);
    System.out.println("# of words: " + wordsCount);
    System.out.println("# of lines: " + linesCount);

    in.close();
}

I don't understand what is happening. Any suggestions?

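Your code looks only at the first character of each default token (word) in the file.

When you do ch = in.next().charAt(0), you get the first character of a token (word), and the scanner then moves forward to the next token, skipping the rest of the current one.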
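To see the effect, here is a minimal sketch (the class name TokenDemo and the sample text are made up for illustration) that collects what in.next().charAt(0) returns for each token. Scanner splits on whitespace by default, so the loop never sees a space or a newline.

```java
import java.util.Scanner;

public class TokenDemo {
    // Collects the first character of every whitespace-delimited token,
    // which is what in.next().charAt(0) yields when called in a loop.
    static String firstChars(String text) {
        StringBuilder sb = new StringBuilder();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                sb.append(in.next().charAt(0));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Three tokens on two lines: only 't', 't' and 'a' are ever inspected.
        System.out.println(firstChars("test text\nand")); // prints "tta"
    }
}
```

Since ch is never ' ' or '\n', the word and line counters in the original loop never increase.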

A different approach: use strings to find the line, word and character counts:

public static void main(String[] args) throws IOException {
        //counters
        int charsCount = 0;
        int wordsCount = 0;
        int linesCount = 0;

        Scanner in = null;
        File selectedFile = null;
        JFileChooser chooser = new JFileChooser();
        // choose file 
        if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
            selectedFile = chooser.getSelectedFile();
            in = new Scanner(selectedFile);
        }

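        while (in.hasNextLine()) {
            String tmpStr = in.nextLine();
            if (!tmpStr.trim().isEmpty()) {
                // characters on this line, excluding all whitespace
                String replaceAll = tmpStr.replaceAll("\\s+", "");
                charsCount += replaceAll.length();
                // split on runs of whitespace so repeated spaces or tabs do not add empty words
                wordsCount += tmpStr.trim().split("\\s+").length;
            }
            ++linesCount;
        }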

        //display the count of characters, words, and lines
        System.out.println("# of chars: " + charsCount);
        System.out.println("# of words: " + wordsCount);
        System.out.println("# of lines: " + linesCount);

        in.close();
    }


Note:
For other encodings, use new Scanner(selectedFile, "###"); instead of new Scanner(selectedFile);

### is the name of the character set to use. Refer to this wiki.
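As a rough illustration of why the charset name matters, the sketch below decodes the same bytes twice. It uses the Scanner(InputStream, String) constructor on an in-memory byte array so that it runs without a file; Scanner(File, String) takes the charset name in the same position. The class name CharsetDemo and the sample text are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class CharsetDemo {
    // Decodes the bytes with the named charset and returns the first line.
    static String firstLine(byte[] bytes, String charsetName) {
        try (Scanner in = new Scanner(new ByteArrayInputStream(bytes), charsetName)) {
            return in.hasNextLine() ? in.nextLine() : "";
        }
    }

    public static void main(String[] args) {
        // "héllo" encoded as UTF-8: the é takes two bytes.
        byte[] utf8 = "h\u00e9llo\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstLine(utf8, "UTF-8").length());      // 5
        System.out.println(firstLine(utf8, "ISO-8859-1").length()); // 6, the é is decoded as two characters
    }
}
```

With the wrong charset the character count changes, so the name passed to Scanner should match how the file was actually saved.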

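There are a few issues here.

First, the test for end of line is going to cause problems, since the end of a line is usually not marked by a single character. Read http://en.wikipedia.org/wiki/End-of-line for more detail on this issue.

The whitespace between words can be more than just the ASCII 32 (space) character. Consider tabs as one case. You more than likely want to check Character.isWhitespace() instead.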
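A minimal sketch of a character-by-character counter built on Character.isWhitespace() could look like this. The class name WhitespaceCount is made up, the text is taken as a String to keep the example short, and only '\n' is treated as a line terminator (so "\r\n" works, but a lone '\r' does not); a final line without a trailing newline is counted as a line.

```java
public class WhitespaceCount {
    // Returns {non-whitespace characters, words, lines} for the given text.
    static int[] count(String text) {
        int chars = 0, words = 0, lines = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                inWord = false;          // any whitespace ends the current word
                if (c == '\n') lines++;  // "\r\n" is counted once, via its '\n'
            } else {
                chars++;
                if (!inWord) {           // first character of a new word
                    words++;
                    inWord = true;
                }
            }
        }
        // Count a last line that does not end with a newline.
        if (!text.isEmpty() && text.charAt(text.length() - 1) != '\n') lines++;
        return new int[] {chars, words, lines};
    }

    public static void main(String[] args) {
        int[] r = count("one\ttwo  three\r\nfour");
        System.out.println(r[0] + " chars, " + r[1] + " words, " + r[2] + " lines");
    }
}
```

Tabs, repeated spaces and Windows line endings are all handled by the same isWhitespace() test, which is the point of the suggestion.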

You could also solve the end-of-line issue with the two scanners described in "How to check the end of line using Scanner?"

Here is a quick hack of the code you provided, along with input and output.

import java.io.*;
import java.util.Scanner;
import javax.swing.JFileChooser;

public final class TextApp {

    public static void main(String[] args) throws IOException {
        // counters
        int charsCount = 0;
        int wordsCount = 0;
        int linesCount = 0;

        Scanner fileScanner = null;
        File selectedFile = null;
        JFileChooser chooser = new JFileChooser();
        // choose file
        if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
            selectedFile = chooser.getSelectedFile();
            fileScanner = new Scanner(selectedFile);
        }

        // outer scanner walks the file line by line
        while (fileScanner.hasNextLine()) {
            linesCount++;
            String line = fileScanner.nextLine();
            Scanner lineScanner = new Scanner(line);
            // inner scanner counts the words on this line and their characters
            while (lineScanner.hasNext()) {
                wordsCount++;
                String word = lineScanner.next();
                charsCount += word.length();
            }

            lineScanner.close();
        }

        // display the count of characters, words, and lines
        System.out.println("# of chars: " + charsCount);
        System.out.println("# of words: " + wordsCount);
        System.out.println("# of lines: " + linesCount);

        fileScanner.close();
    }
}

Here is the test file input:

$ cat ../test.txt 
test text goes here
and here

Here is the output:

$ javac TextApp.java
$ java TextApp 
# of chars: 23
# of words: 6
# of lines: 2
$ wc test.txt 
 2  6 29 test.txt

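The difference between the two character counts comes from not counting whitespace characters, which appears to be what you were trying to do in the original code.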
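The arithmetic can be checked with a small sketch. It assumes the test file consists of exactly the two lines shown, each ending in a single '\n', and that the file is plain ASCII so the byte count wc reports equals the character count. The class name WcDifference is made up.

```java
public class WcDifference {
    // Every character, including spaces and newlines.
    static int countAll(String text) {
        return text.length();
    }

    // Characters left after removing all whitespace.
    static int countNonWhitespace(String text) {
        return text.replaceAll("\\s", "").length();
    }

    public static void main(String[] args) {
        String sample = "test text goes here\nand here\n";
        System.out.println(countAll(sample));           // 29, as reported by wc
        System.out.println(countNonWhitespace(sample)); // 23, as reported by TextApp
    }
}
```

The six missing characters are the four spaces and the two newlines.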

I hope that helps.

You could store each line in a List<String> (called lines below) and then set linesCount = lines.size().

Calculating charsCount:

for(final String line : lines)
    charsCount += line.length();

Calculating wordsCount:

for(final String line : lines)
    wordsCount += line.split(" +").length;

It might be wise to combine these calculations in a single loop rather than doing them separately.
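A combined version might look like the sketch below. The class name CombinedCount is an assumption, and so is the extra check for blank lines: "".split(" +") returns one empty string, so without the check every empty line would be counted as a word.

```java
import java.util.Arrays;
import java.util.List;

public class CombinedCount {
    // Returns {characters, words, lines} in one pass over the lines.
    static int[] count(List<String> lines) {
        int chars = 0;
        int words = 0;
        for (String line : lines) {
            chars += line.length();   // includes spaces, excludes line terminators
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) { // blank lines contribute no words
                words += trimmed.split("\\s+").length;
            }
        }
        return new int[] {chars, words, lines.size()};
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("test text goes here", "and here", "");
        int[] r = count(lines);
        System.out.println(r[0] + " chars, " + r[1] + " words, " + r[2] + " lines");
    }
}
```

For a real file, the list could come from java.nio.file.Files.readAllLines(path).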

Using Scanner methods:

int lines = 0;
int words = 0;
int chars = 0;
while(in.hasNextLine()) {
    lines++;
    Scanner lineScanner = new Scanner(in.nextLine());
    lineScanner.useDelimiter(" ");
    while(lineScanner.hasNext()) {
        words++;
        chars += lineScanner.next().length();
    }
}

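It seems everyone is suggesting an alternative.

The flaw in your logic is that you are not looping through all the characters of each line. You only look at the first character of each token: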

 ch = in.next().charAt(0);

Also, what is charsCount -= linesCount * 2; meant to do?

You might also want to include a try-catch block when accessing the file.

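  try {
      in = new Scanner(selectedFile);
  } catch (FileNotFoundException e) {
      // report the problem instead of silently swallowing it
      System.err.println("File not found: " + selectedFile);
  }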
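One possible way to combine the error handling with closing the scanner is try-with-resources. The sketch below is only an illustration: the class name SafeOpen and the choice to return -1 when the file cannot be opened are assumptions, not part of the original code.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class SafeOpen {
    // Returns the number of lines in the file, or -1 if it cannot be opened.
    static int countLines(File file) {
        // The scanner is closed automatically when the try block exits.
        try (Scanner in = new Scanner(file)) {
            int lines = 0;
            while (in.hasNextLine()) {
                in.nextLine();
                lines++;
            }
            return lines;
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open " + file + ": " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        // Hypothetical file name, used only to trigger the error path.
        System.out.println(countLines(new File("no-such-file-for-this-demo.txt")));
    }
}
```

This also removes the need for a separate in.close() call and for the null check that the JFileChooser version would otherwise need.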

Maybe my code will help you... everything works correctly.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
import java.util.StringTokenizer;

public class LineWordChar {
    public static void main(String[] args) throws IOException {
        // Read the whole text file into one string
        String text = new Scanner(new File("way to your file"), "UTF-8").useDelimiter("\\A").next();
        BufferedReader bf = new BufferedReader(new FileReader("way to your file"));
        int linesi = 0;
        int words = 0;
        int chars = 0;
        // Add 1 to linesi for every line readLine() returns
        while (bf.readLine() != null) {
            linesi++;
        }
        bf.close();
        // The tokenizer splits the big string "text" into words; count them
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            String s = st.nextToken();
            words++;
            // Add the number of characters in this word
            chars += s.length();
        }
        System.out.println("Number of lines: " + linesi);
        System.out.println("Number of words: " + words);
        System.out.print("Number of chars: " + chars);
    }
}

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
import java.util.StringTokenizer;

public class WordCount {

    /**
     * Prints the word, character and sentence counts of one line
     * (lineNumber) of file.txt.
     *
     * @throws FileNotFoundException if file.txt does not exist
     */
    public static void main(String[] args) throws FileNotFoundException {
        int lineNumber = 2; // as you want
        ArrayList<Integer> list = new ArrayList<Integer>();

        File f = new File("file.txt");
        Scanner sc = new Scanner(f);
        int totalLines = 0;
        int totalWords = 0;
        int totalChars = 0;
        int totalSentences = 0;
        while (sc.hasNextLine()) {
            totalLines++;
            if (totalLines == lineNumber) {
                String line = sc.nextLine();
                totalChars += line.length();
                // words are separated by spaces or commas
                totalWords += new StringTokenizer(line, " ,").countTokens();  //line.split("\\s").length;
                totalSentences += line.split("\\.").length;
                break;
            }
            // not the requested line: skip it
            sc.nextLine();
        }
        sc.close();

        list.add(totalChars);
        list.add(totalWords);
        list.add(totalSentences);
        System.out.println(lineNumber + ";" + totalWords + ";" + totalChars + ";" + totalSentences);
    }
}

