Unable to Read CSV File with Apache Commons CSV - IllegalArgumentException

Question

I'm trying to read data from a CSV file (which I downloaded from eBay's MIP server) using Apache Commons CSV, but I'm getting the following error:

java.lang.IllegalArgumentException: Index for header 'Selected Category ID' is 4 but CSVRecord only has 1 values!

I'm not sure what the cause is, since the file clearly contains this column. My CSV file looks like this:

I'm using the following code to read the file:

CSVParser csvParser = null;

String selectedCategoryIDFieldName = "Selected Category ID";

try {
    Reader reader = Files.newBufferedReader(Paths.get(CSVFile));
    csvParser = new CSVParser(reader, CSVFormat.DEFAULT
            .withHeader("SKU", "Locale", "Title", "Channel", selectedCategoryIDFieldName)
            .withIgnoreHeaderCase()
            .withTrim()
            .withSkipHeaderRecord(true));
} catch (Exception e1) {
    // TODO Auto-generated catch block
    e1.printStackTrace();
}

if (csvParser != null) {
    List<CSVRecord> csvRecords = csvParser.getRecords();
    for (CSVRecord csvRecord : csvRecords) {
        // Accessing values by the names assigned to each column
        try {
            long currentRecordNumber = csvRecord.getRecordNumber();
            String SKU = csvRecord.get("SKU");
            String categoryID = csvRecord.get(selectedCategoryIDFieldName);
            // ^^ this line throws `IllegalArgumentException`

            System.out.println("Current record number: " + currentRecordNumber);
            System.out.println("SKU -> " + SKU);
            System.out.println("categoryID -> " + categoryID);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

I searched around SO and the closest question I found is here, but it doesn't apply to my case: the format of my file is exactly the same before and after I save it. In other words, I don't see anything wrong with the format, unlike in the other user's question.

Update: I just realized this error surfaces on the second iteration of the for (CSVRecord csvRecord : csvRecords) loop, even though the file only contains one record. I still don't understand why it iterates twice if there's only one record in the CSV file, and why the error only surfaces for the category ID column and not for the SKU column.

Answer 1

There must be some stray whitespace in record 2 (for example, a trailing line that contains only spaces). Open the file in Notepad or Notepad++ to check.

I am not familiar with Apache Commons CSV, so this may not be the optimal solution:

for (CSVRecord csvRecord : csvRecords) {
    if (csvRecord.size() >= csvParser.getHeaderMap().size()) { // <--- add this if condition
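
If the culprit really is a whitespace-only line, another option is to drop such lines before the parser sees them. As far as I can tell, CSVFormat.DEFAULT skips lines that are completely empty, but a line holding only spaces is still parsed as a record with a single value, which would explain the extra iteration. The following is a minimal sketch using only the JDK; the class name, method name, and sample data are made up for illustration and are not part of Commons CSV.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class BlankLineFilter {

    // Returns the CSV text with empty and whitespace-only lines removed.
    // Assumption: no quoted field contains an embedded blank line.
    static String dropBlankLines(String csvText) {
        return Arrays.stream(csvText.split("\\R"))      // split on any line break
                .filter(line -> !line.trim().isEmpty())  // keep lines with real content
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Made-up sample: header, one record, then a line with only spaces.
        String csv = "SKU,Locale,Title,Channel,Selected Category ID\n"
                   + "A1,en_US,Widget,EBAY_US,1234\n"
                   + "   \n";

        String cleaned = dropBlankLines(csv);
        System.out.println(cleaned);
        // The cleaned text could then be wrapped in a java.io.StringReader
        // and handed to the CSV parser in place of the file reader.
    }
}
```

This reads the whole file into memory, so it only suits small feeds. Commons CSV also offers CSVRecord.isConsistent(), which reports whether a record has as many values as the header and could replace the manual size comparison above.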

Answer 2

Maybe give univocity-parsers a go: it handles broken CSV pretty well (including unexpected spaces here and there), and it's also 3 times faster than commons-csv. It should also make your code cleaner, as you don't have to put try/catch blocks everywhere.

CsvParserSettings settings = new CsvParserSettings();
settings.detectFormatAutomatically();
settings.setHeaders("SKU", "Locale", "Title", "Channel", selectedCategoryIDFieldName);
// settings.setHeaderExtractionEnabled(true); //use this if the headers are in the input

CsvParser parser = new CsvParser(settings);
List<Record> records = parser.parseAllRecords(new File("/path/to/your.csv"));

Hope this helps.

Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license).

Answer 3

If the trailing values of a row are empty, they may be omitted altogether along with their delimiters, which makes the header longer than the row's list of values. This is a pretty valid(?) CSV file. To adapt your parser, use the isSet() method:

row.isSet(column) ? row.get(column) : EMPTY
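
To see what that check guards against, here is a small library-free model of a header-to-index lookup over a short row. It only imitates the idea (the column is known and the row is long enough to hold it); the class and method names are hypothetical and this is not Commons CSV code. It also suggests why the question's loop fails on the category ID but not on SKU: a one-value row still has index 0, but not index 4.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ShortRowLookup {

    // Builds a header-name -> column-index map, in declaration order.
    static Map<String, Integer> headerIndex(String... names) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            index.put(names[i], i);
        }
        return index;
    }

    // True when the column is known AND the row is long enough to hold it.
    static boolean isSet(Map<String, Integer> header, List<String> row, String column) {
        Integer i = header.get(column);
        return i != null && i < row.size();
    }

    // Returns the value for the column, or the fallback when the row is too short.
    static String getOrDefault(Map<String, Integer> header, List<String> row,
                               String column, String fallback) {
        return isSet(header, row, column) ? row.get(header.get(column)) : fallback;
    }

    public static void main(String[] args) {
        Map<String, Integer> header = headerIndex(
                "SKU", "Locale", "Title", "Channel", "Selected Category ID");

        // Models a whitespace-only line after trimming: a single empty value.
        List<String> shortRow = Collections.singletonList("");

        // Index 0 exists, so this prints the (empty) value: []
        System.out.println("[" + getOrDefault(header, shortRow, "SKU", "<missing>") + "]");
        // Index 4 does not exist, so this prints the fallback: [<missing>]
        System.out.println("[" + getOrDefault(header, shortRow, "Selected Category ID", "<missing>") + "]");
    }
}
```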

3 answers

solution1
1 ACCEPTED 2018-03-13 20:44:39

solution2
0 2018-03-14 09:00:20

solution3
0 2021-01-25 18:04:49
