

What is the fastest way to find out how many non-empty lines are in a file, using Java?


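The easiest way is to use a BufferedReader and check which lines are empty. However, this is relatively slow, because it has to create a String object for every line in the file. A faster way is to read the file into an array with read(), then iterate over the array and count the line breaks.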

Here is the code for the two options; the second one took about 50% of the time on my machine.

// Requires: java.io.BufferedReader, java.io.FileReader, java.io.IOException
public static void timeBufferedReader () throws IOException
{
    long bef = System.currentTimeMillis ();

    // The reader buffer size is the same as the array size I use in the other function
    BufferedReader reader = new BufferedReader(new FileReader("test.txt"), 1024 * 10);
    int counter = 0;
    while (reader.ready())
    {
        // readLine() strips the line terminator, so an empty line has length 0
        if (reader.readLine().length() > 0)
            counter++;
    }
    reader.close();

    long after = System.currentTimeMillis() - bef;

    System.out.println("Time: " + after + " Result: " + counter);
}

// Requires: java.io.FileReader, java.io.IOException
public static void timeFileReader () throws IOException
{
    long bef = System.currentTimeMillis();

    FileReader reader = new FileReader("test.txt");
    char[] buf = new char[1024 * 10];
    boolean emptyLine = true;
    int     counter = 0;
    while (reader.ready())
    {
        int len = reader.read(buf,0,buf.length);
        for (int i = 0; i < len; i++)
        {
            // A line break ends the current line; count it only if it had content
            if (buf[i] == '\r' || buf[i] == '\n')
            {
                if (!emptyLine)
                {
                    counter += 1;
                    emptyLine = true;
                }
            }
            else emptyLine = false;
        }
    }
    reader.close();
    // Count a final non-empty line that has no trailing line break
    if (!emptyLine)
        counter += 1;

    long after = System.currentTimeMillis() - bef;

    System.out.println("Time: " + after + " Result: " + counter);
}
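
The character-scanning loop above can be pulled out into a helper that accepts any Reader, which makes it easy to check against small strings before timing it on a real file. The following is only a minimal sketch: the names LineCountSketch and countNonEmptyLines are hypothetical and not part of the original answer. It loops until read() returns -1 instead of polling ready(), counts a final line that has no trailing line break, and wraps IOException in UncheckedIOException to keep the call sites short.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of the original answer.
final class LineCountSketch {
    // Counts lines that contain at least one character other than '\r' or '\n'.
    static int countNonEmptyLines(Reader reader) {
        char[] buf = new char[1024 * 10];
        boolean emptyLine = true;
        int counter = 0;
        try {
            int len;
            // read() returns -1 at end of stream, so no ready() check is needed
            while ((len = reader.read(buf, 0, buf.length)) != -1) {
                for (int i = 0; i < len; i++) {
                    if (buf[i] == '\r' || buf[i] == '\n') {
                        if (!emptyLine) {
                            counter++;
                            emptyLine = true;
                        }
                    } else {
                        emptyLine = false;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The last line may end at end of stream without a line break
        if (!emptyLine) {
            counter++;
        }
        return counter;
    }

    public static void main(String[] args) {
        // Three non-empty lines: Hello, Evil, World
        System.out.println(countNonEmptyLines(new StringReader("Hello\n\n\nEvil\r\n\r\nWorld")));
    }
}
```

As in the original loop, a line made up only of spaces or tabs still counts as non-empty.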


I'm with Limbic System on the NIO recommendation. I added an NIO method to Daphna's test code and benchmarked it against his two methods:

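// Requires: java.io.File, java.io.FileInputStream, java.io.IOException,
// java.nio.MappedByteBuffer, java.nio.channels.FileChannel
public static void timeNioReader () throws IOException {
    long bef = System.currentTimeMillis();

    File file = new File("/Users/stu/test.txt");
    FileChannel fc = (new FileInputStream(file)).getChannel();
    // Map the whole file into memory (a single mapping is limited to Integer.MAX_VALUE bytes)
    MappedByteBuffer buf = fc.map(FileChannel.MapMode.READ_ONLY, 0, file.length());
    boolean emptyLine = true;
    int     counter = 0;

    while (buf.hasRemaining()) {
        byte element = buf.get();

        if (element == '\r' || element == '\n') {
            if (!emptyLine) {
                counter += 1;
                emptyLine = true;
            }
        } else {
            emptyLine = false;
        }
    }
    fc.close();
    // Count a final non-empty line that has no trailing line break
    if (!emptyLine)
        counter += 1;

    long after = System.currentTimeMillis() - bef;

    System.out.println("timeNioReader      Time: " + after + " Result: " + counter);
}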



timeBufferedReader Time: 947 Result: 747656
timeFileReader     Time: 670 Result: 747656
timeNioReader      Time: 251 Result: 747656



//jvm start, warming up
timeBufferedReader Time: 121 Result: 53404
timeFileReader     Time: 65 Result: 53404
timeNioReader      Time: 40 Result: 53404

//still warming up
timeBufferedReader Time: 107 Result: 53404
timeFileReader     Time: 60 Result: 53404
timeNioReader      Time: 20 Result: 53404

//ripping along
timeBufferedReader Time: 79 Result: 53404
timeFileReader     Time: 56 Result: 53404
timeNioReader      Time: 16 Result: 53404
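
The numbers above drop from one round to the next, which is the usual effect of JVM warm-up, so a single timed run can be misleading. The following is only a rough sketch of a harness that repeats a measurement and prints each run. The names WarmupTimingSketch and timeRuns are hypothetical, and the in-memory workload in main is a stand-in assumption rather than the file-based methods benchmarked here.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

// Hypothetical timing harness, not part of the original answer.
final class WarmupTimingSketch {
    // Runs the task `runs` times and returns the elapsed milliseconds of each run.
    static long[] timeRuns(IntSupplier task, int runs) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            int result = task.getAsInt();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000;
            System.out.println("run " + i + " Time: " + elapsedMs[i] + " Result: " + result);
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Stand-in workload (an assumption): count line breaks in an in-memory array
        char[] data = new char[10_000_000];
        Arrays.fill(data, 'x');
        for (int i = 0; i < data.length; i += 80) {
            data[i] = '\n';
        }
        timeRuns(() -> {
            int n = 0;
            for (char c : data) {
                if (c == '\n') {
                    n++;
                }
            }
            return n;
        }, 5);
    }
}
```

To compare the three methods this way, each would need to return its counter instead of printing it, so that it can be passed in as the task.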


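The simplest way is to use a Scanner (yes, I like verbose code... you can make it physically shorter). Scanner() also accepts a File, a Reader, etc., so you can pass it whatever you have.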

import java.util.Scanner;

public class Main
{
    public static void main(final String[] argv)
    {
        final Scanner scanner;
        final int     lines;

        scanner = new Scanner("Hello\n\n\nEvil\n\nWorld");
        lines   = countLines(scanner);
        System.out.println("lines = "  + lines);
    }

    private static int countLines(final Scanner scanner)
    {
        int lines;

        lines = 0;

        // Read line by line and count only the lines that have content
        while(scanner.hasNextLine())
        {
            final String line;

            line = scanner.nextLine();

            if(line.length() > 0)
            {
                lines++;
            }
        }

        return lines;
    }
}

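If it really must be as fast as possible, you should look into NIO. Then test your code on your target platform to see whether NIO is really better there. I got an order-of-magnitude improvement on some code I was playing with for the Netflix Prize. It involved parsing thousands of files into a more compact, fast-loading binary format. NIO was a big help on my (slow) development laptop.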
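
Memory mapping is not the only NIO route. Another variant to measure reads the file in chunks through a FileChannel into a reusable direct ByteBuffer. The following is a minimal sketch under stated assumptions: the names ChannelCountSketch, countNonEmptyLines and countInTempFile are hypothetical, the 64 KB chunk size is arbitrary, and the encoding is assumed to be ASCII-compatible (such as UTF-8) so that line breaks can be detected byte by byte. Whether it beats the mapped version is something to test on the target platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, not part of the original answer.
final class ChannelCountSketch {
    // Counts non-empty lines by reading the file in chunks through a FileChannel.
    static int countNonEmptyLines(Path path) {
        try (FileChannel fc = FileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024); // arbitrary chunk size
            boolean emptyLine = true;
            int counter = 0;
            while (fc.read(buf) != -1) {
                buf.flip(); // switch the buffer from filling to draining
                while (buf.hasRemaining()) {
                    byte b = buf.get();
                    if (b == '\r' || b == '\n') {
                        if (!emptyLine) {
                            counter++;
                            emptyLine = true;
                        }
                    } else {
                        emptyLine = false;
                    }
                }
                buf.clear(); // make the whole buffer writable again
            }
            // The last line may end at end of file without a line break
            if (!emptyLine) {
                counter++;
            }
            return counter;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience for trying the counter on a small string via a temporary file.
    static int countInTempFile(String content) {
        try {
            Path tmp = Files.createTempFile("lines", ".txt");
            try {
                Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
                return countNonEmptyLines(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countInTempFile("Hello\n\n\nEvil\n\nWorld"));
    }
}
```

Unlike a single mapping, the chunked read is not capped at Integer.MAX_VALUE bytes, although the int counter would still overflow beyond about two billion non-empty lines.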

