How to read large text files efficiently in Java

I am reading an 18 MB file and storing its contents in a two-dimensional array, but the program takes almost 15 minutes to run. Is there any way to reduce the running time? The file contains only binary values. Thanks in advance.

import java.io.File;
import java.io.IOException;
import java.util.NoSuchElementException;
import java.util.Scanner;

public class test
{
    public static void main(String[] args) throws IOException
    {
        int m = 2160;
        int n = 4320;
        int[][] lof = new int[n][m];
        String filename = "D:/New Folder/ETOPOCHAR";
        double range_km = 1.0;
        double alonn = -57.07; //180 to 180
        double alat = 38.53;

        try (Scanner input = new Scanner(new File(filename))) {
            while (input.hasNextLine()) {
                for (int i = 0; i < m; i++) {
                    for (int j = 0; j < n; j++) {
                        try
                        {
                            lof[j][i] = input.nextInt();
                            // print the input matrix
                            System.out.println("value[" + j + "][" + i + "] = " + lof[j][i]);
                        }
                        catch (NoSuchElementException e) {
                            //  e.printStackTrace();
                        }
                    }
                }
            }
        }
    }
}

I have also tried reading the file into a byte array, but I cannot work out how to store it in a 2D array:

import java.io.File;
import java.io.FileInputStream;

public class FileToArrayOfBytes
{
    public static void main( String[] args )
    {
        File file = new File("name of file");

        byte[] bFile = new byte[(int) file.length()];

        // convert file into array of bytes
        // (note: a single read() call is not guaranteed to fill the whole array)
        try (FileInputStream fileInputStream = new FileInputStream(file)) {
            fileInputStream.read(bFile);

            for (int i = 0; i < bFile.length; i++) {
                System.out.print((char) bFile[i]);
            }

            System.out.println("Done");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

You can read the file into a byte array first, then deserialize those bytes. Start with a 2048-byte input buffer, then experiment by increasing or decreasing its size, keeping the sizes you try to powers of two (512, 1024, 2048, etc.).

As far as I remember, there is a good chance that the best performance is achieved with a 2048-byte buffer, but this is OS dependent and should be verified.
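One way to run that experiment is sketched below. It is only an illustration, not part of the original answer: the class name BufferSizeTrial, the helper readAll, and the 512 to 65536 range are all assumptions made for the example. It reads the same file once per buffer size and prints the elapsed time.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

/** Hypothetical helper for comparing read-buffer sizes; names are illustrative. */
class BufferSizeTrial {

    /** Reads the whole file through a buffer of the given size and returns the byte count. */
    static long readAll(File file, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (FileInputStream in = new FileInputStream(file)) {
            int n;
            while ((n = in.read(buffer, 0, buffer.length)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);
        // Powers of two, as suggested above; the upper bound is an arbitrary choice.
        for (int size = 512; size <= 65536; size *= 2) {
            long start = System.nanoTime();
            long bytes = readAll(file, size);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.printf("buffer=%6d bytes: read %d bytes in %d us%n", size, bytes, micros);
        }
    }
}
```

Treat the numbers as rough: the first pass usually pays for loading the file into the OS cache and for JIT warm-up, so it is worth running the program several times before comparing sizes.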

Code sample (you can try different values of the BUFFER_SIZE variable; in my case I read a 7.5 MB test file in less than one second):

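import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Class name is illustrative; the original snippet showed only main().
public class ReadFile {
  // Starting value suggested above; try other powers of two.
  private static final int BUFFER_SIZE = 2048;

  public static void main(String... args) throws IOException {
    File f = new File(args[0]);
    byte[] buffer = new byte[BUFFER_SIZE];
    ByteBuffer result = ByteBuffer.allocateDirect((int) f.length());
    try (FileInputStream fis = new FileInputStream(f)) {
      int bytesRead;
      int totalBytesRead = 0;
      while ((bytesRead = fis.read(buffer, 0, BUFFER_SIZE)) != -1) {
        result.put(buffer, 0, bytesRead);
        totalBytesRead += bytesRead;
      }
      // debug info
      System.out.printf("Read %d bytes%n", totalBytesRead);

      // Here you can do whatever you want with the result, including creation of a 2D array...
      int pos = result.position();
      result.rewind();
      // getInt() interprets each group of 4 bytes as one big-endian int
      for (int i = 0; i < pos / 4; i++) {
        System.out.println(result.getInt());
      }
    }
  }
}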
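As a rough sketch of the "creation of a 2D array" step, the code below fills an int[n][m] array in the same order as the loops in the question. It uses a different technique from getInt(): it assumes the file is ASCII text holding whitespace-separated decimal integers, which is inferred from the question's use of Scanner.nextInt() and is not confirmed by the source. The class name GridParser and the method toGrid are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical parser; assumes ASCII text with whitespace-separated integers. */
class GridParser {

    /** Fills grid[j][i] for i in [0, m) and j in [0, n), matching the question's loop order. */
    static int[][] toGrid(byte[] data, int m, int n) {
        int[][] grid = new int[n][m];
        int pos = 0;
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                // Skip spaces, tabs, and line breaks between values.
                while (pos < data.length && data[pos] <= ' ') {
                    pos++;
                }
                if (pos >= data.length) {
                    return grid; // fewer values than expected; the rest stays 0
                }
                boolean negative = data[pos] == '-';
                if (negative) {
                    pos++;
                }
                int start = pos;
                int value = 0;
                while (pos < data.length && data[pos] >= '0' && data[pos] <= '9') {
                    value = value * 10 + (data[pos] - '0');
                    pos++;
                }
                if (pos == start) {
                    throw new IllegalArgumentException("Unexpected byte at offset " + pos);
                }
                grid[j][i] = negative ? -value : value;
            }
        }
        return grid;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        int[][] lof = toGrid(data, 2160, 4320); // dimensions taken from the question
        System.out.println("value[0][0] = " + lof[0][0]);
    }
}
```

Parsing from an in-memory byte array avoids Scanner's per-token overhead, and printing only a sample value instead of every element removes the millions of println calls that the original loop performs.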

Take your time and read the docs for the java.io and java.nio packages, as well as the Scanner class, to improve your understanding.
