Count characters, words and lines in a file
This should count the number of lines, words and characters in a file.
But it doesn't work. The output shows only 0.
Code:
public static void main(String[] args) throws IOException {
    int ch;
    boolean prev = true;
    //counters
    int charsCount = 0;
    int wordsCount = 0;
    int linesCount = 0;
    Scanner in = null;
    File selectedFile = null;
    JFileChooser chooser = new JFileChooser();
    // choose file
    if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
        selectedFile = chooser.getSelectedFile();
        in = new Scanner(selectedFile);
    }
    // count the characters of the file till the end
    while(in.hasNext()) {
        ch = in.next().charAt(0);
        if (ch != ' ') ++charsCount;
        if (!prev && ch == ' ') ++wordsCount;
        // don't count if previous char is space
        if (ch == ' ')
            prev = true;
        else
            prev = false;
        if (ch == '\n') ++linesCount;
    }
    //display the count of characters, words, and lines
    charsCount -= linesCount * 2;
    wordsCount += linesCount;
    System.out.println("# of chars: " + charsCount);
    System.out.println("# of words: " + wordsCount);
    System.out.println("# of lines: " + linesCount);
    in.close();
}
I can't understand what's going on. Any suggestions?
Your code looks only at the first character of each default token (word) in the file.
When you call ch = in.next().charAt(0), you get the first character of a token (word), and the scanner then moves on to the next token, skipping the rest of the current one. Whitespace is the default delimiter, so a space or a newline never reaches your comparisons.
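To see the effect described above in isolation, here is a small sketch. The class and method names (TokenDemo, firstChars) are made up for illustration, and the input string is just sample text rather than a file:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

final class TokenDemo {
    // Collects what in.next().charAt(0) would see: one character per token.
    static List<Character> firstChars(String text) {
        List<Character> seen = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNext()) {
                // next() returns a whole whitespace-delimited token;
                // charAt(0) keeps its first character and drops the rest
                seen.add(in.next().charAt(0));
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [t, t, g, h, a, h]: six tokens, no spaces, no newlines
        System.out.println(firstChars("test text goes here\nand here"));
    }
}
```

Since no space or '\n' ever shows up in that list, the word and line counters in the question can never be incremented.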
A different approach, using strings to find the line, word and character counts:
public static void main(String[] args) throws IOException {
    //counters
    int charsCount = 0;
    int wordsCount = 0;
    int linesCount = 0;
    Scanner in = null;
    File selectedFile = null;
    JFileChooser chooser = new JFileChooser();
    // choose file
    if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
        selectedFile = chooser.getSelectedFile();
        in = new Scanner(selectedFile);
    }
    if (in == null) {
        return; // no file was chosen
    }
    // hasNextLine() also reports trailing blank lines, which hasNext() would skip
    while (in.hasNextLine()) {
        String tmpStr = in.nextLine();
        if (!tmpStr.trim().isEmpty()) {
            // characters are counted without any whitespace
            String replaceAll = tmpStr.replaceAll("\\s+", "");
            charsCount += replaceAll.length();
            // trim first so leading whitespace does not produce an empty word
            wordsCount += tmpStr.trim().split("\\s+").length;
        }
        ++linesCount;
    }
    //display the count of characters, words, and lines
    System.out.println("# of chars: " + charsCount);
    System.out.println("# of words: " + wordsCount);
    System.out.println("# of lines: " + linesCount);
    in.close();
}
Use
new Scanner(selectedFile, "###");
in place of
new Scanner(selectedFile);
where ### is the character set the file needs. (selectedFile is already a File, so it does not need to be wrapped in new File(...).) Refer to this and the wiki.
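To illustrate why the character set matters for counting, here is a minimal sketch. The class name CharsetDemo, the sample word and the two charset names are only illustrative choices. It reads from an in-memory byte array instead of a file so it can be run as is; the two-argument Scanner constructor works the same way with a File:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

final class CharsetDemo {
    // Decodes the bytes with the named charset and returns the length of the first token.
    static int firstTokenLength(byte[] bytes, String charsetName) {
        try (Scanner in = new Scanner(new ByteArrayInputStream(bytes), charsetName)) {
            return in.hasNext() ? in.next().length() : 0;
        }
    }

    public static void main(String[] args) {
        // "h\u00e9llo" is "hello" with an accented e, which takes two bytes in UTF-8
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstTokenLength(utf8, "UTF-8"));      // 5
        System.out.println(firstTokenLength(utf8, "ISO-8859-1")); // 6
    }
}
```

The same bytes give a different character count depending on how they are decoded, so the count is only meaningful if the charset matches the file.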
You have a couple of issues here.
First, the test for the end of line is going to cause problems, since the end of a line often isn't marked by a single character. Read http://en.wikipedia.org/wiki/End-of-line for more detail on this issue.
The whitespace between words can be more than just the ASCII 32 (space) character. Consider tabs as one case. More than likely you want to check Character.isWhitespace() instead.
You could also solve the end-of-line issues with the two scanners described in "How to check the end of line using Scanner?".
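If you would rather keep the character-by-character idea from the question, the two points above could be handled roughly like this. This is only a sketch under my own assumptions: it reads from a Reader instead of a Scanner, treats "\n", "\r\n" and a lone "\r" as line endings, and the names (CharCounter, count) are made up:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class CharCounter {
    // Returns {non-whitespace chars, words, lines} for the text supplied by the reader.
    static int[] count(Reader reader) throws IOException {
        int chars = 0, words = 0, lines = 0;
        boolean inWord = false;      // currently inside a run of non-whitespace
        boolean lastWasCr = false;   // previous character was '\r'
        boolean pendingLine = false; // text seen since the last line terminator
        int c;
        while ((c = reader.read()) != -1) {
            if (c == '\n') {
                if (!lastWasCr) lines++; // for "\r\n" the line was counted at '\r'
                pendingLine = false;
            } else if (c == '\r') {
                lines++;
                pendingLine = false;
            } else {
                pendingLine = true;
            }
            lastWasCr = (c == '\r');
            if (Character.isWhitespace(c)) { // spaces, tabs, '\r', '\n', ...
                inWord = false;
            } else {
                chars++;
                if (!inWord) words++;        // first character of a new word
                inWord = true;
            }
        }
        if (pendingLine) lines++; // last line without a terminator
        return new int[] {chars, words, lines};
    }

    public static void main(String[] args) throws IOException {
        int[] r = count(new StringReader("test text goes here\r\nand\there\n"));
        System.out.println("# of chars: " + r[0]);
        System.out.println("# of words: " + r[1]);
        System.out.println("# of lines: " + r[2]);
    }
}
```

For a real file you would pass something like a BufferedReader around a FileReader instead of the StringReader.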
Here is a quick hack on the code you provided, along with its input and output.
import java.io.*;
import java.util.Scanner;
import javax.swing.JFileChooser;

public final class TextApp {
    public static void main(String[] args) throws IOException {
        //counters
        int charsCount = 0;
        int wordsCount = 0;
        int linesCount = 0;
        Scanner fileScanner = null;
        File selectedFile = null;
        JFileChooser chooser = new JFileChooser();
        // choose file
        if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
            selectedFile = chooser.getSelectedFile();
            fileScanner = new Scanner(selectedFile);
        }
        while (fileScanner.hasNextLine()) {
            linesCount++;
            String line = fileScanner.nextLine();
            Scanner lineScanner = new Scanner(line);
            // count the characters of the file till the end
            while(lineScanner.hasNext()) {
                wordsCount++;
                String word = lineScanner.next();
                charsCount += word.length();
            }
            lineScanner.close();
        }
        //display the count of characters, words, and lines
        System.out.println("# of chars: " + charsCount);
        System.out.println("# of words: " + wordsCount);
        System.out.println("# of lines: " + linesCount);
        fileScanner.close();
    }
}
Here is the test file input:
$ cat ../test.txt
test text goes here
and here
Here is the output:
$ javac TextApp.java
$ java TextApp
# of chars: 23
# of words: 6
# of lines: 2
$ wc test.txt
2 6 29 test.txt
The difference in character count (23 versus the 29 reported by wc) comes from not counting whitespace characters, which appears to be what you were trying to do in the original code.
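The two numbers can be reproduced with a few lines. This is just a sketch: the class name CountCompare is made up, and it assumes the test file ends with a newline (which matches wc reporting 2 lines):

```java
final class CountCompare {
    // Counts only the characters that are not whitespace.
    static int nonWhitespace(String text) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Contents of test.txt, assuming a trailing newline
        String text = "test text goes here\nand here\n";
        System.out.println(nonWhitespace(text)); // 23, as printed by TextApp
        System.out.println(text.length());       // 29, as reported by wc
    }
}
```

The gap of 6 is the four spaces plus the two newlines.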
I hope that helps.
You could store every line in a List<String> lines, and then linesCount = lines.size().
Calculating charsCount:
for(final String line : lines)
    charsCount += line.length();
Calculating wordsCount:
for(final String line : lines)
    wordsCount += line.split(" +").length;
It would probably be wise to combine these calculations rather than doing them separately.
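Combined into one loop, it might look like the sketch below. The class and method names are made up. I also added a trim() and an empty-line check, which are not in the snippets above, because "".split(" +") returns one empty string and a line starting with spaces yields an empty first element, and either would be counted as a word:

```java
import java.util.Arrays;
import java.util.List;

final class CombinedCount {
    // Returns {chars, words, lines} computed in a single pass over the stored lines.
    static int[] count(List<String> lines) {
        int charsCount = 0;
        int wordsCount = 0;
        for (final String line : lines) {
            charsCount += line.length(); // includes spaces, as in the snippet above
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                wordsCount += trimmed.split(" +").length;
            }
        }
        return new int[] {charsCount, wordsCount, lines.size()};
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("test text goes here", "", "  and   here");
        // prints [31, 6, 3]
        System.out.println(Arrays.toString(count(lines)));
    }
}
```

Note that charsCount here includes the spaces inside each line, unlike the answers that strip whitespace first.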
Use Scanner methods:
int lines = 0;
int words = 0;
int chars = 0;
while(in.hasNextLine()) {
    lines++;
    Scanner lineScanner = new Scanner(in.nextLine());
    // "\\s+" treats any run of whitespace as one delimiter,
    // so repeated spaces do not produce empty words
    lineScanner.useDelimiter("\\s+");
    while(lineScanner.hasNext()) {
        words++;
        chars += lineScanner.next().length();
    }
    lineScanner.close();
}
It looks like everyone is suggesting an alternative.
The flaw in your logic is that you are not looping through all the characters of the line. You are only looking at the first character of every token (word):
ch = in.next().charAt(0);
Also, what does the 2 in charsCount -= linesCount * 2; represent?
You might also want to include a try-catch block when accessing a file.
try {
    in = new Scanner(selectedFile);
} catch (FileNotFoundException e) {
    // report the problem instead of silently ignoring it
    System.err.println("File not found: " + selectedFile);
}
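A variant of the same idea is try-with-resources, which also closes the Scanner for you. The sketch below only counts lines to keep it short; the names SafeOpen and countLines and the -1 return value for a missing file are my own choices, not anything from the question:

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

final class SafeOpen {
    // Returns the number of lines in the file, or -1 if it cannot be opened.
    static int countLines(File file) {
        // The Scanner is closed automatically when the try block exits
        try (Scanner in = new Scanner(file)) {
            int lines = 0;
            while (in.hasNextLine()) {
                in.nextLine();
                lines++;
            }
            return lines;
        } catch (FileNotFoundException e) {
            System.err.println("Cannot open " + file + ": " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        // Falls back to test.txt when no path is given on the command line
        File file = new File(args.length > 0 ? args[0] : "test.txt");
        System.out.println("# of lines: " + countLines(file));
    }
}
```

This way a cancelled dialog or a missing file gives a readable message instead of a NullPointerException later on.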
Maybe my code will help you. Everything works correctly.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
import java.util.StringTokenizer;

public class LineWordChar {
    public static void main(String[] args) throws IOException {
        // Convert our text file to a string ("\\A" makes the whole file one token)
        Scanner sc = new Scanner(new File("way to your file"), "UTF-8").useDelimiter("\\A");
        String text = sc.hasNext() ? sc.next() : ""; // an empty file has no token
        sc.close();
        BufferedReader bf = new BufferedReader(new FileReader("way to your file"));
        int linesi = 0;
        int words = 0;
        int chars = 0;
        // While further lines are present in the file, add 1 to linesi
        while (bf.readLine() != null) {
            linesi++;
        }
        bf.close();
        // The tokenizer splits the big string "text" into words, which we count
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            String s = st.nextToken();
            words++;
            // Add the number of chars in this word
            chars += s.length();
        }
        System.out.println("Number of lines: " + linesi);
        System.out.println("Number of words: " + words);
        System.out.print("Number of chars: " + chars);
    }
}
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
import java.util.StringTokenizer;

public class WordCount {
    /**
     * Prints the word count, character count and sentence count
     * of a single line of the file.
     *
     * @throws FileNotFoundException if file.txt does not exist
     */
    public static void main(String[] args) throws FileNotFoundException {
        int lineNumber = 2; // the line to analyse, change as you want
        ArrayList<Integer> list = new ArrayList<Integer>();
        File f = new File("file.txt");
        Scanner sc = new Scanner(f);
        int totalLines = 0;
        int totalWords = 0;
        int totalChars = 0;
        int totalSentences = 0;
        while (sc.hasNextLine()) {
            totalLines++;
            if (totalLines == lineNumber) {
                String line = sc.nextLine();
                totalChars += line.length();
                totalWords += new StringTokenizer(line, " ,").countTokens(); //line.split("\\s").length;
                totalSentences += line.split("\\.").length;
                break;
            }
            sc.nextLine(); // skip lines before the requested one
        }
        sc.close();
        list.add(totalChars);
        list.add(totalWords);
        list.add(totalSentences);
        System.out.println(lineNumber + ";" + totalWords + ";" + totalChars + ";" + totalSentences);
    }
}