
Reading CSVs from a zip file a line at a time

I've got a Spring MVC app with a file upload capability. Files are passed to the controller as a MultipartFile, from which it's easy to get an InputStream. I'm uploading zip files that contain CSVs, and I'm struggling to find a way to open the CSVs and read them a line at a time. There are plenty of examples on the net of reading into a fixed-size buffer. I've tried this, but the buffers don't concatenate very well, and it soon gets out of sync and uses a lot of memory:

        ZipEntry entry = input.getNextEntry();

        while(entry != null)
        {
            if (entry.getName().matches("Data/CSV/[a-z]{0,1}[a-z]{0,1}.csv"))
            {
                final String fullPath = entry.getName();
                final String filename = fullPath.substring(fullPath.lastIndexOf('/') + 1);

                visitor.startFile(filename);                    

                final StringBuilder fileContent = new StringBuilder();

                final byte[] buffer = new byte[1024];                   

                while (input.read(buffer) > 0)
                    fileContent.append(new String(buffer));

                final String[] lines = fileContent.toString().split("\n");  

                for(String line : lines)
                {
                    final String[] columns = line.split(",");
                    final String postcode = columns[0].replace(" ", "").replace("\"", "");

                    if (columns.length > 3)
                        visitor.location(postcode, "", "");
                }   

                visitor.endFile();                  
            }

            entry = input.getNextEntry();
        }

There must be a better way that actually works.

It's not clear whether this suits your needs, but have you tried opencsv (http://opencsv.sourceforge.net)? Their example is really intuitive:

CSVReader reader = new CSVReader(new FileReader("yourfile.csv"));
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
    // nextLine[] is an array of values from the line
    System.out.println(nextLine[0] + nextLine[1] + "etc...");
}

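For your case, all you need to do is wrap the zip entry stream in a BufferedReader and pass that reader to a CSVReader. Note that GZIPInputStream only reads gzip (.gz) data, not zip archives, so use the ZipInputStream you already have once getNextEntry() has positioned it at a CSV entry:

// 'input' is the ZipInputStream from the question, positioned at a CSV entry
InputStreamReader isr = new InputStreamReader(input);
BufferedReader br = new BufferedReader(isr);
CSVReader reader = new CSVReader(br);
// Don't close the reader between entries; that would also close the ZipInputStream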

You could use a BufferedReader, which includes the convenient readLine() method and won't load the entire contents of the file into memory, e.g.:

BufferedReader br = new BufferedReader(new InputStreamReader(input), 1024);
String line = null;
while ((line = br.readLine()) != null) {
   String[] columns = line.split(",");
   //rest of your code
}
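
To show how this fits with the entry loop from the question, here is a minimal, self-contained sketch. The original loop goes wrong because it ignores the byte count returned by read(), so new String(buffer) also appends stale bytes left over from the previous read. The class name ZipCsvLines, the forEachLine method, the callback, the simple ".csv" suffix filter, and the UTF-8 charset are all assumptions made for illustration; swap in your own entry pattern, encoding, and visitor calls.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of any library.
final class ZipCsvLines {

    /**
     * Passes each line of each ".csv" entry to the callback as (entryName, line).
     * Lines are read one at a time, so a whole file is never held in memory.
     */
    static void forEachLine(InputStream zipped, BiConsumer<String, String> onLine)
            throws IOException {
        // Closing the ZipInputStream at the end also closes the wrapped stream.
        try (ZipInputStream input = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = input.getNextEntry()) != null) {
                // Assumed filter; the question uses a stricter regex on the entry name.
                if (entry.isDirectory() || !entry.getName().endsWith(".csv")) {
                    continue; // getNextEntry() skips the unread data of this entry
                }
                // A fresh reader per entry, because the previous one has hit end-of-entry.
                // It is deliberately not closed: closing it would close 'input'
                // and make the remaining entries unreadable.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(input, StandardCharsets.UTF_8)); // charset assumed
                String line;
                while ((line = reader.readLine()) != null) {
                    onLine.accept(entry.getName(), line);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small zip in memory so the sketch runs without external files.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("Data/CSV/ab.csv"));
            zip.write("\"AB1 0AA\",1,2,3\n\"AB1 0AB\",4,5,6\n".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        forEachLine(new ByteArrayInputStream(bytes.toByteArray()),
                (name, line) -> System.out.println(name + " -> " + line));
    }
}
```

In the controller you would pass multipartFile.getInputStream() as the first argument and call visitor.startFile, visitor.location, and visitor.endFile from inside the loop instead of using a callback. Reading works per entry because ZipInputStream reports end-of-stream at the end of the current entry, so readLine() returns null there rather than running on into the next file.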
