Quickly read in large amount of data

I am looking for a quick way to read roughly 150 MB of spectroscopic data into a program I am writing. The data is currently stored in a text file (.dat) in a format like:

489.99992 490.00000
0.01178
0.01409

where the first N values are x values separated by spaces, and the last N values are y values separated by newline characters (e.g., x1 = 489.99992, x2 = 490.00000, y1 = 0.01178, y2 = 0.01409).

I wrote the following parser:

private void parse()
{
    FileReader reader = null;
    String currentNumber = "";
    int indexOfIntensity = 0;
    long startTime = System.currentTimeMillis();

    try 
    {
        reader = new FileReader(FILE);
        char[] chars = new char[65536];
        boolean waveNumMode = true;
        double valueAsDouble;

        //get buffer sized chunks of data from the file
        for(int len; (len = reader.read(chars)) > 0;)
        {
            //parse through the buffer
            for(int i = 0; i < len; i++)
            {                   
                //is a new number if true
                if((chars[i] == ' ' || chars[i] == '\n') && !currentNumber.isEmpty())
                {
                    try 
                    {
                        valueAsDouble = Double.parseDouble(currentNumber);
                    }catch(NumberFormatException nfe)
                    {
                        System.out.println("Could not convert to double: " + currentNumber);
                        currentNumber = "";
                        continue;
                    }

                    if(waveNumMode) 
                    {
                        //System.out.println("Wavenumber: " + valueAsDouble);
                        listOfPoints.add(new Tuple(valueAsDouble));
                    }else
                    {
                        //System.out.println("Intensity: " + valueAsDouble);
                        listOfPoints.get(indexOfIntensity).setIntensityValue(valueAsDouble);
                        indexOfIntensity++;
                    }


                    if(chars[i] == '\n') 
                    {
                        waveNumMode = false;
                    }

                    currentNumber = ""; //clear for the next number
                    continue;
                }

                currentNumber += chars[i];
            }
        }

    } catch (IOException e) {
        e.printStackTrace();
    }

    try 
    {
        //reader is still null if the file could not be opened
        if(reader != null)
        {
            reader.close();
        }
    } catch (IOException e) 
    {
        e.printStackTrace();
    }

    long stopTime = System.currentTimeMillis();
    System.out.println("Execution time: " + ((stopTime - startTime) / 1000.0) + " seconds");
}

but this takes around 50 seconds to finish for the 150 MB file. For reference, we are using another piece of software that does this in roughly half a second (although it uses its own custom file type). I am willing to use a different file type, or anything else really, if it brings the execution time down. How can I speed this up?

Thanks in advance

In order to optimize code, you first need to find what parts of the code are slowing things down. Use a profiler to measure your code's performance and identify what parts are slowing down the process.
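If a full profiler is not at hand, a rough first step is to time each phase separately. The sketch below is only an illustration: the class name PhaseTimer, the helper timed, and the inline sample text are assumptions made for this example, and wall-clock timing of this kind is far less precise than a real profiler.

```java
import java.util.function.Supplier;

final class PhaseTimer {

    /** Runs one phase, prints how long it took, and returns the phase's result. */
    static <T> T timed(String label, Supplier<T> phase) {
        long start = System.nanoTime();
        T result = phase.get();
        double millis = (System.nanoTime() - start) / 1_000_000.0;
        System.out.printf("%s: %.1f ms%n", label, millis);
        return result;
    }

    public static void main(String[] args) {
        // Stand-in input; in the real program this would be the file contents.
        String text = "489.99992 490.00000\n0.01178\n0.01409\n";

        // Phase 1: split the text into whitespace-separated tokens.
        String[] tokens = timed("tokenize", () -> text.trim().split("\\s+"));

        // Phase 2: convert every token to a double.
        double[] values = timed("parse doubles", () -> {
            double[] out = new double[tokens.length];
            for (int i = 0; i < out.length; i++) {
                out[i] = Double.parseDouble(tokens[i]);
            }
            return out;
        });

        System.out.println(values.length + " values");
    }
}
```

Wrapping the file read, the tokenizing, and the list updates of the original parser in separate timed calls should show which of them accounts for most of the 50 seconds.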

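Try reading all the bytes from the file at once and then parsing them in memory:

Files.readAllBytes(Paths.get(fileName))

This replaces the repeated reader.read() calls with a single bulk read, as repeated reader.read() calls can be costly in Java.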
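A minimal sketch of this approach follows, assuming the layout described in the question (x values on the first line, one y value per following line) and ASCII content. The class name SpectrumBytesParser and the default file name spectrum.dat are placeholders invented for this example. Token boundaries are tracked by index, so no string is built one character at a time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

final class SpectrumBytesParser {

    /** Parses the whole file content and returns { x values, y values }. */
    static double[][] parse(byte[] data) {
        double[] values = new double[1024]; // grows by doubling
        int count = 0;
        int xCount = -1; // number of x values, fixed at the first newline
        int start = -1;  // start index of the current token, -1 if none

        // Iterate one step past the end so the last token is flushed.
        for (int i = 0; i <= data.length; i++) {
            byte b = (i == data.length) ? (byte) '\n' : data[i];
            boolean separator = (b == ' ' || b == '\n' || b == '\r' || b == '\t');
            if (!separator) {
                if (start < 0) {
                    start = i;
                }
                continue;
            }
            if (start >= 0) {
                if (count == values.length) {
                    values = Arrays.copyOf(values, count * 2);
                }
                String token = new String(data, start, i - start, StandardCharsets.US_ASCII);
                values[count++] = Double.parseDouble(token);
                start = -1;
            }
            // The first newline after at least one value ends the x section.
            if (b == '\n' && xCount < 0 && count > 0) {
                xCount = count;
            }
        }
        if (xCount < 0) {
            xCount = count;
        }
        double[] x = Arrays.copyOfRange(values, 0, xCount);
        double[] y = Arrays.copyOfRange(values, xCount, count);
        return new double[][] { x, y };
    }

    public static void main(String[] args) throws IOException {
        // "spectrum.dat" is a placeholder file name.
        String fileName = args.length > 0 ? args[0] : "spectrum.dat";
        byte[] data = Files.readAllBytes(Paths.get(fileName));
        double[][] xy = parse(data);
        System.out.println(xy[0].length + " x values, " + xy[1].length + " y values");
    }
}
```

Note that Files.readAllBytes holds the entire file in memory, so the JVM heap must be large enough for the 150 MB file plus the parsed arrays.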

You can also try wrapping your FileReader in a BufferedReader and then check whether there is any performance gain.
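One way the BufferedReader variant might look is sketched below, again assuming the layout from the question (x values on the first line, then one y value per line). The class name SpectrumLineParser and the default file name spectrum.dat are placeholders made up for this example. Be aware that readLine() loads the whole first line, which holds every x value, into a single String, so this version needs a correspondingly large heap.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.StringTokenizer;

final class SpectrumLineParser {

    /** Reads x values from the first line and one y value from each later line. */
    static double[][] parse(Reader source) {
        try (BufferedReader in = new BufferedReader(source, 1 << 16)) {
            String first = in.readLine();
            if (first == null) {
                return new double[][] { new double[0], new double[0] };
            }

            // The first line holds all x values, separated by whitespace.
            StringTokenizer tokens = new StringTokenizer(first);
            double[] x = new double[tokens.countTokens()];
            for (int i = 0; i < x.length; i++) {
                x[i] = Double.parseDouble(tokens.nextToken());
            }

            // Each following non-blank line holds one y value.
            // If the file has fewer y values than x values, the rest stay 0.0.
            double[] y = new double[x.length];
            int n = 0;
            String line;
            while (n < y.length && (line = in.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    y[n++] = Double.parseDouble(line);
                }
            }
            return new double[][] { x, y };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // "spectrum.dat" is a placeholder file name.
        String fileName = args.length > 0 ? args[0] : "spectrum.dat";
        double[][] xy = parse(new FileReader(fileName));
        System.out.println(xy[0].length + " x values, " + xy[1].length + " y values");
    }
}
```

Storing the values in primitive double arrays, rather than adding one object per point to a list, also avoids per-point allocation, which may matter as much as the reader choice.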

For more info, visit the link:

https://www.geeksforgeeks.org/different-ways-reading-text-file-java/
