How can I skip the first line of a csv in Java?

I want to skip the first line and use the second as header.

I am using classes from apache commons csv to process a CSV file.

The header of the CSV file is in the second row, not the first (which contains coordinates).

My code looks like this:

static void processFile(final File file) throws IOException {
    FileReader filereader = new FileReader(file);
    final CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';');
    CSVParser parser = new CSVParser(filereader, format);
    final List<CSVRecord> records = parser.getRecords();
    //stuff
}

I naively thought,

CSVFormat format = CSVFormat.DEFAULT.withFirstRecordAsHeader().withDelimiter(';');

would solve the problem, as it's different from withFirstRowAsHeader and I thought it would detect that the first row doesn't contain any semicolons and is not a record. It doesn't. I tried to skip the first line (that CSVFormat seems to think is the header) with

CSVFormat format = CSVFormat.DEFAULT.withSkipHeaderRecord().withFirstRecordAsHeader().withDelimiter(';');

but that also doesn't work. What can I do? What's the difference between withFirstRowAsHeader and withFirstRecordAsHeader?

If the first line is the header, the correct way to skip it is to use a different CSVFormat:

CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';').withFirstRecordAsHeader();

Update: June 30 2022

For 1.9+, use

CSVFormat.DEFAULT.builder()
    .setDelimiter(';')
    .setSkipHeaderRecord(true)  // skip header
    .build();

You may want to read the first line yourself before passing the reader to the CSVParser:

static void processFile(final File file) throws IOException {
    FileReader filereader = new FileReader(file);
    BufferedReader bufferedReader = new BufferedReader(filereader);
    bufferedReader.readLine(); // discard the first line
    final CSVFormat format = CSVFormat.DEFAULT.withDelimiter(';');
    // pass the BufferedReader, not the FileReader, so the parser starts at line 2
    CSVParser parser = new CSVParser(bufferedReader, format);
    final List<CSVRecord> records = parser.getRecords();
    //stuff
}
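The reason this works can be shown without the CSV library at all. The sketch below is only an illustration: the class and method names (SkipFirstLineSketch, skipFirstLine, remainingLines) are made up for this example, and remainingLines is a stand-in for the parser. It shows that whatever consumes the BufferedReader next starts at the second line. Note that the same BufferedReader has to be handed on, because it may already have buffered text beyond the first line from the underlying reader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SkipFirstLineSketch {

    // Discards the first physical line and returns a reader positioned
    // at the start of the second line. (Hypothetical helper name.)
    static BufferedReader skipFirstLine(Reader source) {
        BufferedReader reader = new BufferedReader(source);
        try {
            reader.readLine(); // returns null on empty input, so nothing is skipped then
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return reader;
    }

    // Stand-in for the CSV parser: collects every line that is still unread.
    static List<String> remainingLines(BufferedReader reader) {
        List<String> lines = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        String csv = "51.5;-0.12\nname;age\nAlice;30\n";
        BufferedReader reader = skipFirstLine(new StringReader(csv));
        // The header row is now the first thing a downstream consumer sees.
        System.out.println(remainingLines(reader));
    }
}
```

In the real code, the reader returned by skipFirstLine would be passed to the CSVParser in place of remainingLines.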

In version 1.9.0 of org.apache.commons:commons-csv, use (Kotlin syntax):

val format = CSVFormat.Builder.create(CSVFormat.DEFAULT)
        .setHeader()
        .setSkipHeaderRecord(true)
        .build()

val parser = CSVParser.parse(reader, format)

You can skip the first record using a stream:

List<CSVRecord> noHeadersLine = records.stream().skip(1).collect(Collectors.toList());

You can filter it using Java Streams:

parser.getRecords().stream()
     .filter(record -> record.getRecordNumber() != 1) 
     .collect(Collectors.toList());
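A minimal sketch of what the stream approach does to an already-parsed list, using plain strings in place of CSVRecord objects so it runs without the library. The class and method names (SkipRecordSketch, withoutFirst) are invented for illustration. One thing it makes visible: when the file has a garbage line followed by the header, dropping the first element leaves the header row as the first remaining element, so it still has to be handled separately.

```java
import java.util.List;
import java.util.stream.Collectors;

class SkipRecordSketch {

    // Returns a new list without the first element; order of the rest is kept.
    // An empty input simply yields an empty list.
    static <T> List<T> withoutFirst(List<T> records) {
        return records.stream()
                .skip(1)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> parsed = List.of("51.5;-0.12", "name;age", "Alice;30");
        List<String> rest = withoutFirst(parsed);
        // rest.get(0) is now the header row, followed by the data rows.
        System.out.println(rest);
    }
}
```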

I am assuming your file format looks something like:

<garbage line here>
<header data>
<record data starts here>

For version 1.9.0, use the builder as given above, with one addition: read the first line from the reader before handing it to the parser:

Reader in = new FileReader(fileName);
BufferedReader bufferedReader = new BufferedReader(in);
System.out.println(bufferedReader.readLine());
CSVFormat format = CSVFormat.Builder.create(CSVFormat.DEFAULT)
            .setHeader()
            .setSkipHeaderRecord(true)
            .build();
CSVParser parser = CSVParser.parse(bufferedReader, format);
for (CSVRecord record : parser.getRecords()) {
    // do something
}

If you don't skip that first line somehow, you will get an IllegalArgumentException.

You could consume the first line yourself and then pass the reader to the CSVParser. Other than that, there is a method, withIgnoreEmptyLines, which might solve the issue.

The .setHeader() method must be called for .setSkipHeaderRecord(true) to take effect.

CSVFormat.DEFAULT.builder()
    .setDelimiter(';')
    .setHeader()
    .setSkipHeaderRecord(true)  // skip header
    .build();
