Apache commons csv skip lines

How can I skip lines in an input file with Apache Commons CSV? In my file, the first few lines are not data but useful meta-information such as the date. I can't find any option for this.

private void parse() throws Exception {
    Iterable<CSVRecord> records = CSVFormat.EXCEL
            .withQuote('"').withDelimiter(';').parse(new FileReader("example.csv"));
    for (CSVRecord csvRecord : records) {
        //do something            
    }
}

Use BufferedReader.readLine() before starting the for-loop. FileReader has no readLine() method of its own, so wrap it in a BufferedReader.

Your example:

private void parse() throws Exception {
  // FileReader has no readLine(); BufferedReader provides it.
  BufferedReader reader = new BufferedReader(new FileReader("example.csv"));
  reader.readLine(); // Read and discard the first line.

  Iterable<CSVRecord> records = CSVFormat.EXCEL.withQuote('"').withDelimiter(';').parse(reader);
  for (CSVRecord csvRecord : records) {
    // do something
  }
}
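
When more than one leading line has to go, the same idea can be wrapped in a small helper. The following is only a sketch: LineSkipper and skipLines are hypothetical names, not part of Commons CSV, and the helper rethrows IOException unchecked purely to keep call sites short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper (not part of Commons CSV): discards leading lines. */
final class LineSkipper {
    private LineSkipper() {}

    /**
     * Consumes up to {@code count} lines from the reader and returns how many
     * lines were actually skipped (fewer if the input ends early).
     */
    static int skipLines(BufferedReader reader, int count) {
        int skipped = 0;
        try {
            // Stop as soon as enough lines are consumed or the input is exhausted.
            while (skipped < count && reader.readLine() != null) {
                skipped++;
            }
        } catch (IOException e) {
            // Design choice of this sketch: surface I/O failures unchecked.
            throw new UncheckedIOException(e);
        }
        return skipped;
    }
}
```

After calling LineSkipper.skipLines(reader, 3), the same reader can be handed to CSVFormat.EXCEL.withQuote('"').withDelimiter(';').parse(reader) as in the example above.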

There is no built-in facility to skip an unknown number of lines.

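If you want to skip only the first line (the header line), you can call withSkipHeaderRecord() while building the parser. Note that it only takes effect when a header is also defined with withHeader(); alternatively, withFirstRecordAsHeader() reads the first record as the header and skips it.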

A more general solution would be to call next() on the iterator:

Iterable<CSVRecord> parser = CSVFormat.DEFAULT.parse(new FileReader("example.csv"));
Iterator<CSVRecord> iterator = parser.iterator();

for (int i = 0; i < amountToSkip; i++) {
    if (iterator.hasNext()) {
        iterator.next();
    }
}

while (iterator.hasNext()) {
    CSVRecord record = iterator.next();
    System.out.println(record);
}
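
For the case where the number of leading lines is not known in advance, one option is to consume lines from a BufferedReader until the first wanted line is recognised, then rewind to the start of that line before handing the reader to the parser. This is a sketch under assumptions: PreambleSkipper, skipUntil and MAX_LINE_LENGTH are hypothetical names, the caller must supply some test that recognises the first wanted line, and that line is assumed to be shorter than MAX_LINE_LENGTH characters.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Predicate;

/** Hypothetical helper (not part of Commons CSV): skips a preamble of unknown length. */
final class PreambleSkipper {
    // Assumption: the first wanted line, terminator included, fits in this many characters.
    private static final int MAX_LINE_LENGTH = 8192;

    private PreambleSkipper() {}

    /**
     * Discards lines until one satisfies {@code isFirstWanted}, then rewinds so
     * that the matching line is the next thing read from the reader.
     *
     * @return true if a matching line was found, false if the input ended first
     */
    static boolean skipUntil(BufferedReader reader, Predicate<String> isFirstWanted) {
        try {
            while (true) {
                reader.mark(MAX_LINE_LENGTH); // remember where this line starts
                String line = reader.readLine();
                if (line == null) {
                    return false; // end of input, nothing matched
                }
                if (isFirstWanted.test(line)) {
                    reader.reset(); // un-read the matching line
                    return true;
                }
            }
        } catch (IOException e) {
            // Design choice of this sketch: surface I/O failures unchecked.
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, PreambleSkipper.skipUntil(reader, line -> line.startsWith("id;")) would leave the reader positioned at a header line beginning with "id;" (an invented column name), ready to be passed to CSVFormat.DEFAULT.parse(reader).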

CSVParser.iterator() should most definitely not throw an exception on iterator.hasNext(), as that makes it nearly impossible to recover from an error condition.

But where there is a will there is a way, and I present a Terrible Idea that sorta works™

    public void runOnFile(Path file) {
        try {
            // fixHeaders(...) is this answer's own helper; its body is not shown here.
            BufferedReader in = fixHeaders(file);
            CSVParser parsed = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(in);
            Map<String, Integer> headerMap = parsed.getHeaderMap();
            String[] headers = headerMap.keySet().toArray(new String[0]);

            String line;
            while ((line = in.readLine()) != null) {
                try {
                    // Parse each line on its own so that one bad line cannot abort the whole run.
                    CSVRecord record = CSVFormat.DEFAULT.withHeader(headers)
                            .parse(new StringReader(line)).getRecords().get(0);
                    // do something with your record
                } catch (Exception e) {
                    System.out.println("ignoring line:" + line);
                }
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

You can skip the header line like this:

        Reader excelInput = new FileReader("example.csv");

        CSVFormat csvFormat = CSVFormat.EXCEL.withSkipHeaderRecord(true).withHeader("Arm1", "Arm2", "Arm3", "Arm4",
            "Arm5", "Arm6");

        CSVParser csvParser = new CSVParser(excelInput, csvFormat);

The key point is to pass true to withSkipHeaderRecord() and also to specify the headers that you want to skip inside withHeader().

If you are aware of the line numbers you want to skip, you could do something like this:

for (CSVRecord csvRecord : csvParser) {
    if (csvRecord.getRecordNumber() == 1) {
        continue; // skip record 1
    }
    // process the remaining records here
}

where record 1 is the one you want to skip.
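
If several known record numbers have to be skipped, the same check can be made against a set instead of a single value. The sketch below is deliberately library-independent: RecordFilter and withoutPositions are hypothetical names, and the helper counts positions itself, starting at 1 to mirror the numbering the loop above relies on. With a real parser you would test csvRecord.getRecordNumber() against the set instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of Commons CSV): drops items at given 1-based positions. */
final class RecordFilter {
    private RecordFilter() {}

    /** Returns the items whose 1-based position is NOT contained in {@code skip}. */
    static <T> List<T> withoutPositions(Iterable<T> records, Set<Long> skip) {
        List<T> kept = new ArrayList<>();
        long position = 0;
        for (T record : records) {
            position++; // the first item has position 1
            if (!skip.contains(position)) {
                kept.add(record);
            }
        }
        return kept;
    }
}
```

Since the parser is an Iterable<CSVRecord>, it could be passed in directly as the first argument.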
