简体   繁体   中英

Unable to Read CSV File with Apache Commons CSV - IllegalArgumentException

I'm attempting to access data from CSV file (which I downloaded from EBay's MIP server) using Apache Commons CSV however I'm encountering the following error:

java.lang.IllegalArgumentException: Index for header 'Selected Category ID' is 4 but CSVRecord only has 1 values!

I'm not quite sure the cause since the file clearly contains this index. My CSV file looks like this:

在此处输入图片说明

I'm using the following code to access file:

CSVParser csvParser = null;

    String selectedCategoryIDFieldName = "Selected Category ID";

    try {
        Reader reader = Files.newBufferedReader(Paths.get(CSVFile));
        csvParser = new CSVParser(reader, CSVFormat.DEFAULT
                .withHeader("SKU", "Locale", "Title", "Channel", selectedCategoryIDFieldName)
                .withIgnoreHeaderCase()
                .withTrim()
                .withSkipHeaderRecord(true));
    } catch (Exception e1) {
        // TODO Auto-generated catch block
        e1.printStackTrace();
    }

    if (csvParser != null) {
        List<CSVRecord> csvRecords = csvParser.getRecords();
        for (CSVRecord csvRecord : csvRecords) {
            // Accessing values by the names assigned to each column

            try {
                long currentRecordNumber = csvRecord.getRecordNumber();
                String SKU = csvRecord.get("SKU");
                String categoryID = csvRecord.get(selectedCategoryIDFieldName);
                // ^^ this line throws `IllegalArgumentException`


                System.out.println("Current record number: " + currentRecordNumber);
                System.out.println("SKU - >  " + SKU);
                System.out.println("categoyrID -> "  + categoryID);


            } catch (Exception e) {
                e.printStackTrace();
            }   
        }

I searched around SO and the closest question I found is here but it is not related to my question as the format is exactly the same before/after I save it (In other words, I don't see anything wrong with the format like in the other users question).

Update: I just realized this error surfaces on the second iteration of the for (CSVRecord csvRecord : csvRecords) loop (the file only contains one record). However, I still don't understand why it is iterating twice if there's only one record in CSV file.. and why does it only surface by category ID and not by SKU column?

There must be some empty spaces in record 2. Open the file in notepad or notepad++.

I am not familiar with apache commons csv so it may not be the optimum solution

for (CSVRecord csvRecord : csvRecords) {
   if(csvRecord.size() >= csvParser.getHeaderMap.size()){  <--- add this if condition

Maybe give univocity-parsers a go as it handles broken CSV pretty well (including dealing with unexpected spaces here and there) and it's also 3 times faster than commons-csv. It should also make your code cleaner as you don't have to put try/catch blocks everywhere.

CsvParserSettings settings = new CsvParserSettings();
settings.detectFormatAutomatically();
settings.setHeaders("SKU", "Locale", "Title", "Channel", selectedCategoryIDFieldName);
// settings.setHeaderExtractionEnabled(true); //use this if the headers are in the input

CsvParser parser = new CsvParser(settings);
List<Record> records = parser.parseAllRecords(new File("/path/to/your.csv"));

Hope this helps.

Disclaimer: I'm the author of this library. It's open source and free (Apache 2.0 license)

If trailing values of the row are empty, they may be omitted altogether with theirs delimiters causing size of the header to be bigger than size of the row values. This is pretty valid(?) CSV file. To adapt your parser use isSet() method:

row.isSet(column) ? row.get(column) : EMPTY

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM