
Java: how to use Scanner to read and count paragraphs

For example, if I have the following lines of text in a file:

this is an example. this is an example.

this is an example. this is an example. this is an example

this is an example this is an example this is an example this is an example this is an example this is an example this is an example this is an example this is an example this is an example.

I want to be able to count these lines as 3 paragraphs. Right now my code counts them as 4 paragraphs, because it does not know where a paragraph begins and ends.

Scanner file = new Scanner(new FileInputStream("../.../output.txt"));
int count = 0;
while (file.hasNextLine()) { //whilst scanner has more lines
    Scanner s = new Scanner(file.nextLine());
    if(!file.hasNext()){
        break;
    }
    else{
        file.nextLine();
        count++;
    }
    s.close();
}
System.out.println("Number of paragraphs: "+ count);
file.close();

This is what I have so far. It reads lines of text and treats each line as a single paragraph.

I want it to treat lines of text that have no empty line between them as 1 paragraph, and to count all the paragraphs in the file.

Scanner probably isn't the best choice if you only want to count lines. BufferedReader is probably better.

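    // Requires: import java.io.BufferedReader; import java.io.FileReader;
    BufferedReader in = new BufferedReader(new FileReader("output.txt"));
    String line = in.readLine();
    int count = 0;
    StringBuilder paragraph = new StringBuilder();
    while (true) {
        // A blank line or the end of the file closes the current paragraph.
        if (line == null || line.trim().length() == 0) {
            // Count only non-empty paragraphs, so repeated or trailing blank lines add nothing.
            if (paragraph.length() > 0) {
                count++;
                System.out.println("paragraph " + count + ":" + paragraph.toString());
                paragraph.setLength(0);
            }
            if (line == null)
                break;
        } else {
            paragraph.append(" ");
            paragraph.append(line);
        }
        line = in.readLine();
    }
    in.close();
    System.out.println("Number of paragraphs: " + count);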
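
If you would rather stay with Scanner, as in the question, the same idea should carry over, because nextLine() returns an empty string for a blank line. Here is a minimal sketch, assuming paragraphs are separated by one or more blank lines. The class name ParagraphCounter and the method countParagraphs are made up for illustration, and the sample text is built in memory rather than read from output.txt.

```java
import java.util.Scanner;

public class ParagraphCounter {
    // Hypothetical helper: counts runs of non-blank lines separated by blank lines.
    public static int countParagraphs(Scanner in) {
        int count = 0;
        boolean inParagraph = false; // true while we are inside a paragraph
        while (in.hasNextLine()) {
            String line = in.nextLine();
            if (line.trim().isEmpty()) {
                // A blank line ends the current paragraph, if there is one.
                inParagraph = false;
            } else if (!inParagraph) {
                // First non-blank line after a gap starts a new paragraph.
                inParagraph = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample input (an assumption): three paragraphs, one spanning two lines.
        String text = "this is an example. this is an example.\n\n"
                + "this is an example.\nthis is an example\n\n\n"
                + "this is an example.\n";
        try (Scanner in = new Scanner(text)) {
            System.out.println("Number of paragraphs: " + countParagraphs(in));
        }
    }
}
```

To run it against a file, you could pass new Scanner(new FileInputStream(...)) as in the question instead of the in-memory string.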

You will not be able to see the newline characters using Scanner. The nextLine() method strips the line terminator (\n) from each line it returns.

You need to use a class and methods that read the bytes of the file, so that you can detect the spaces and newline characters.

Try the read() method of FileInputStream.
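
As a rough sketch of that byte-level approach: read one byte at a time and treat a line that contains only whitespace as a paragraph break. The class name ByteParagraphCounter and the method countParagraphs are hypothetical, the code assumes \n or \r\n line endings, and it takes any InputStream so that it can be tried on an in-memory string.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ByteParagraphCounter {
    // Hypothetical helper: counts paragraphs by inspecting raw bytes from read().
    public static int countParagraphs(InputStream in) throws IOException {
        int count = 0;
        boolean inParagraph = false; // currently inside a paragraph
        boolean lineHasText = false; // current line has a non-whitespace byte
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                // A newline that ends a line with no text marks a paragraph break.
                if (!lineHasText) {
                    inParagraph = false;
                }
                lineHasText = false;
            } else if (b != ' ' && b != '\t' && b != '\r') {
                lineHasText = true;
                if (!inParagraph) {
                    inParagraph = true;
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Sample input (an assumption): three paragraphs separated by blank lines.
        byte[] data = "first line\nsecond line\n\nnext paragraph\n \nlast one"
                .getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println("Number of paragraphs: " + countParagraphs(in));
        }
    }
}
```

For a real file you could pass a FileInputStream, ideally wrapped in a BufferedInputStream, since calling read() byte by byte on an unbuffered stream tends to be slow.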
