
Most efficient way to read a huge file

I am not familiar with the Java NIO APIs, and I need help answering a commonly asked interview question: given a file that contains 50 GB of data, what is the most efficient way to read the data and find the most frequent word?

BufferedReader.readLine() is a better API than Scanner for this. Is there any other approach, apart from creating multiple threads that read the file in batches using BufferedReader.readLine()?
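As a baseline for the question above, here is a minimal single-threaded sketch that streams the file line by line and counts words in a HashMap. The class name WordFrequency and the method mostFrequentWord are hypothetical, and the sketch assumes UTF-8 text, whitespace-separated words, and a set of distinct words small enough to fit in memory; none of this comes from the original post.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the original post.
class WordFrequency {

    // Streams the file one line at a time and returns the most frequent word
    // together with its count, or null if the file contains no words.
    static Map.Entry<String, Long> mostFrequentWord(Path file) throws IOException {
        Map<String, Long> counts = new HashMap<>();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Assumption: words are separated by runs of whitespace.
                for (String word : line.split("\\s+")) {
                    if (!word.isEmpty()) {
                        counts.merge(word, 1L, Long::sum);
                    }
                }
            }
        }
        // Linear scan for the entry with the highest count.
        Map.Entry<String, Long> best = null;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            if (best == null || e.getValue() > best.getValue()) {
                best = e;
            }
        }
        return best;
    }

    public static void main(String[] args) throws IOException {
        Map.Entry<String, Long> best = mostFrequentWord(Paths.get(args[0]));
        System.out.println(best == null ? "(no words)" : best.getKey() + " " + best.getValue());
    }
}
```

Only one line is held in memory at a time, so the file size itself is not the limit; the map of distinct words is. If the file has very long lines or a huge vocabulary, this sketch would need a different tokenizer or an external aggregation step.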

See the java.nio.channels.FileChannel javadocs:

A region of a file may be mapped directly into memory; for large files this is often much more efficient than invoking the usual read or write methods.
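The mapping approach from the javadoc quote might look like the sketch below. A single FileChannel.map call cannot map more than Integer.MAX_VALUE bytes, so a 50 GB file has to be mapped in windows. The class name MappedWordCount, the method countWords, and the 256 MiB default window are assumptions for illustration, and the tokenizer assumes ASCII text in which any byte at or below the space character separates words.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the original post.
class MappedWordCount {

    // Assumption: an arbitrary window size; any value up to Integer.MAX_VALUE works.
    static final long DEFAULT_WINDOW = 1L << 28; // 256 MiB

    // Maps the file window by window and counts whitespace-separated words.
    static Map<String, Long> countWords(Path file, long window) throws IOException {
        if (window <= 0 || window > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("window must be in 1..Integer.MAX_VALUE");
        }
        Map<String, Long> counts = new HashMap<>();
        // The partial word is kept across windows, so a word that straddles
        // a window boundary is still counted once.
        StringBuilder word = new StringBuilder();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            for (long pos = 0; pos < size; pos += window) {
                long len = Math.min(window, size - pos);
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (map.hasRemaining()) {
                    byte b = map.get();
                    if (b >= 0 && b <= ' ') {
                        flush(word, counts); // separator byte ends the current word
                    } else {
                        // Assumption: ASCII input; other bytes are kept as Latin-1 chars.
                        word.append((char) (b & 0xFF));
                    }
                }
            }
        }
        flush(word, counts); // last word if the file does not end with a separator
        return counts;
    }

    // Adds the pending word (if any) to the counts and resets the builder.
    private static void flush(StringBuilder word, Map<String, Long> counts) {
        if (word.length() > 0) {
            counts.merge(word.toString(), 1L, Long::sum);
            word.setLength(0);
        }
    }
}
```

Calling MappedWordCount.countWords(path, MappedWordCount.DEFAULT_WINDOW) returns the full count map, from which the most frequent word can be picked with one pass over the entries. Whether this beats a plain buffered read depends on the operating system and storage, so it is worth measuring both on the actual data.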

Perhaps the class below will give you the fastest way of reading input:

 import java.io.DataInputStream; 
 import java.io.FileInputStream; 
 import java.io.IOException; 

public class Main 
{ 
static class Reader 
{ 
    private static final int BUFFER_SIZE = 1 << 16; // 64 KiB read buffer 
    private DataInputStream din; 
    private byte[] buffer; 
    private int bufferPointer, bytesRead; 

    public Reader() 
    { 
        din = new DataInputStream(System.in); 
        buffer = new byte[BUFFER_SIZE]; 
        bufferPointer = bytesRead = 0; 
    } 

    public Reader(String file_name) throws IOException 
    { 
        din = new DataInputStream(new FileInputStream(file_name)); 
        buffer = new byte[BUFFER_SIZE]; 
        bufferPointer = bytesRead = 0; 
    } 

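    // Reads one line without its trailing '\n'; returns null at end of input. 
    public String readLine() throws IOException 
    { 
        byte[] buf = new byte[64]; // initial capacity; grows for longer lines 
        int cnt = 0, c; 
        while ((c = read()) != -1) 
        { 
            if (c == '\n') 
                break; 
            if (cnt == buf.length) 
                buf = java.util.Arrays.copyOf(buf, cnt * 2); 
            buf[cnt++] = (byte) c; 
        } 
        if (c == -1 && cnt == 0) 
            return null; 
        return new String(buf, 0, cnt); 
    } 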

    // The number parsers assume well-formed decimal input; overflow is not checked. 
    public int nextInt() throws IOException 
    { 
        int ret = 0; 
        byte c = skipWhitespace(); 
        boolean neg = (c == '-'); 
        if (neg) 
            c = read(); 
        do
        { 
            ret = ret * 10 + c - '0'; 
        } while ((c = read()) >= '0' && c <= '9'); 

        if (neg) 
            return -ret; 
        return ret; 
    } 

    public long nextLong() throws IOException 
    { 
        long ret = 0; 
        byte c = skipWhitespace(); 
        boolean neg = (c == '-'); 
        if (neg) 
            c = read(); 
        do
        { 
            ret = ret * 10 + c - '0'; 
        } while ((c = read()) >= '0' && c <= '9'); 

        if (neg) 
            return -ret; 
        return ret; 
    } 

    public double nextDouble() throws IOException 
    { 
        double ret = 0, div = 1; 
        byte c = skipWhitespace(); 
        boolean neg = (c == '-'); 
        if (neg) 
            c = read(); 

        do
        { 
            ret = ret * 10 + c - '0'; 
        } while ((c = read()) >= '0' && c <= '9'); 

        // Optional fractional part. 
        if (c == '.') 
        { 
            while ((c = read()) >= '0' && c <= '9') 
            { 
                ret += (c - '0') / (div *= 10); 
            } 
        } 

        if (neg) 
            return -ret; 
        return ret; 
    } 

    // Skips whitespace and control bytes; returns the first byte of the next token. 
    private byte skipWhitespace() throws IOException 
    { 
        byte c = read(); 
        while (c != -1 && c <= ' ') 
            c = read(); 
        if (c == -1) 
            throw new IOException("unexpected end of input"); 
        return c; 
    } 

    private void fillBuffer() throws IOException 
    { 
        bytesRead = din.read(buffer, bufferPointer = 0, BUFFER_SIZE); 
        if (bytesRead == -1) 
            bytesRead = 0; // end of stream; read() then reports -1 
    } 

    // Returns the next byte, or -1 at end of stream. A raw 0xFF byte cannot be 
    // told apart from end of stream, which is harmless for ASCII or UTF-8 text. 
    private byte read() throws IOException 
    { 
        if (bufferPointer == bytesRead) 
            fillBuffer(); 
        if (bytesRead == 0) 
            return -1; 
        return buffer[bufferPointer++]; 
    } 

    public void close() throws IOException 
    { 
        if (din == null) 
            return; 
        din.close(); 
    } 
} 

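// Demo: reads n and k from standard input, then counts how many of the 
// next n integers are divisible by k. 
public static void main(String[] args) throws IOException 
{ 
    Reader s = new Reader(); 
    int n = s.nextInt(); 
    int k = s.nextInt(); 
    int count = 0; 
    while (n-- > 0) 
    { 
        int x = s.nextInt(); 
        if (x % k == 0) 
            count++; 
    } 
    System.out.println(count); 
} 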
} 
