
Parsing csv file with indexOutOfBound in java

I am new to Java and am practicing parsing a csv file. I understand what IndexOutOfBound means, but I don't understand why my parsed data doesn't behave as in the tutorials I've read, such as https://examples.javacodegeeks.com/java-csv-parsing-example/

I can only read the first column, which is data[0]. There must be something wrong with my parsing method, but I cannot figure it out. Any help or hint is highly appreciated.

My data file looks like this:

  [0],    [1], [2],    [3]  ,    [4]    ,   [5]   ,  [6] ,   [7]  ,  [8] , [9]
class, gender, age, bodyType, profession, pregnant, isYou ,species, isPet, role
scenario:green,   ,         ,           ,         ,        ,      ,      ,
person, female, 24, average , doctor    , FALSE   ,        ,      ,      , passenger
animal, male  ,  4,         ,           , FALSE   ,        , dog  , TRUE , pedestrian
  .
  .

I tried to parse it like this:

ArrayList<String> csvContents = new ArrayList<String>();

try (BufferedReader csvReader = new BufferedReader(new FileReader(csvFile))) {
    String headerLine = csvReader.readLine(); // get rid of header

    String line;
    while ((line = csvReader.readLine()) != null) {
        csvContents.add(line); // add the line to the ArrayList
    }

    for (String csvLine : csvContents) {

        // split by comma and remove redundant spaces
        String[] data = csvLine.split("\\s*,\\s*");
        System.out.println(data[1]); // IndexOutOfBound

        Character character = null;
        String clazz = data[0].toLowerCase(); // cannot use word "class" as a variable

        Profession professionEnum = Profession.valueOf(data[4].toUpperCase());
        Gender genderEnum = Gender.valueOf(data[1].toUpperCase());
        BodyType bodyTypeEnum = BodyType.valueOf(data[3].toUpperCase());

        if (clazz.startsWith("scenario")) {
            scenario = new Scenario();
            scenario.setLegalCrossing(clazz.endsWith("green"));
            continue;
        } else if ("person".equals(clazz)) {
            person = new Person(Integer.parseInt(data[2]), professionEnum, genderEnum, bodyTypeEnum, Boolean.parseBoolean(data[5]));
            person.setAsYou(Boolean.parseBoolean(data[6]));
        } else if ("animal".equals(clazz)) {
            animal = new Animal(Integer.parseInt(data[2]), genderEnum, bodyTypeEnum, data[7]);
            animal.setIsPet(Boolean.parseBoolean(data[8]));
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}

EDIT

Printing csvLine before the split:

scenario:green,,,,,,,,,
person,female,24,average,doctor,false,false,,,passenger
person,male,40,overweight,unknown,false,false,,,passenger
person,female,2,average,,false,false,,,passenger
person,male,82,average,,false,false,,,pedestrian
person,female,32,average,ceo,true,false,,,pedestrian
person,male,7,athletic,,false,false,,,pedestrian
animal,male,4,,,false,false,dog,true,pedestrian
scenario:red,,,,,,,,,


After splitting, data has just one element, so accessing data[1] throws the exception. Solution: try another regex, such as "," only.

P.S. Your csv is malformed at

scenario:green, , , , , , , ,

Try adding one more ",".
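A note on where the single element comes from: Java's String.split(regex) discards trailing empty strings, so a line that ends in a run of empty cells collapses no matter which comma regex is used. Passing a negative limit keeps them. The sketch below illustrates this with the scenario line from the question; the class and helper names (SplitTrailingDemo, splitKeepingEmpty) are made up for the example.

```java
public class SplitTrailingDemo {
    // Hypothetical helper: splits on commas, strips spaces around them,
    // and keeps trailing empty cells thanks to the negative limit.
    static String[] splitKeepingEmpty(String csvLine) {
        return csvLine.trim().split("\\s*,\\s*", -1);
    }

    public static void main(String[] args) {
        String line = "scenario:green,,,,,,,,,";

        // Default split: trailing empty strings are removed.
        System.out.println(line.split("\\s*,\\s*").length); // 1
        System.out.println(line.split(",").length);         // 1

        // Negative limit: every cell is kept, empty or not.
        System.out.println(splitKeepingEmpty(line).length); // 10
    }
}
```

With the negative limit, data[1] through data[9] exist for every row that has nine commas, although they may be empty strings.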

You need to fill in complete data for all cells in a row.

For example, the first line in your csv contains only one cell with a value, scenario:green, which is data[0].

If you fill in data for all the other cells in your csv, you will start receiving data[1], data[2], data[3], ...
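One way to spot rows that are short of cells is to compare each row's cell count with the header's. This is only a sketch: RowWidthCheck, cellCount and malformedRows are hypothetical names, and it assumes plain comma separation with no quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowWidthCheck {
    // Counts the cells in a line, including empty ones (limit -1 keeps trailing empties).
    static int cellCount(String csvLine) {
        return csvLine.split(",", -1).length;
    }

    // Returns the 1-based positions of rows whose cell count differs from the header's.
    static List<Integer> malformedRows(String header, List<String> rows) {
        int expected = cellCount(header);
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (cellCount(rows.get(i)) != expected) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        String header = "class,gender,age,bodyType,profession,pregnant,isYou,species,isPet,role";
        List<String> rows = Arrays.asList(
                "scenario:green,,,,,,,,", // 8 commas, so only 9 cells
                "person,female,24,average,doctor,false,false,,,passenger");
        System.out.println(malformedRows(header, rows)); // [1]
    }
}
```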

I've figured it out. It's counterintuitive to me, though. I need to check the length of the parsed data array before reading each attribute, like this:

ArrayList<String> csvContents = new ArrayList<String>();

try (BufferedReader csvReader = new BufferedReader(new FileReader(csvFile))) {
    String headerLine = csvReader.readLine(); // get rid of header

    String line;
    while ((line = csvReader.readLine()) != null) {
        csvContents.add(line); // add the line to the ArrayList
    }

    for (String csvLine : csvContents) {

        // split by comma and remove redundant spaces;
        // trailing empty cells are dropped, so a scenario line yields one element
        String[] data = csvLine.split("\\s*,\\s*");

        Character character = null;
        String clazz = data[0].toLowerCase(); // cannot use word "class" as a variable

        if (clazz.startsWith("scenario") && data.length == 1) {
            scenario = new Scenario();
            scenario.setLegalCrossing(clazz.endsWith("green"));
            continue;
        } else if ("person".equals(clazz) && data.length == 10) {
            Profession professionEnum = Profession.valueOf(data[4].toUpperCase());
            Gender genderEnum = Gender.valueOf(data[1].toUpperCase());
            BodyType bodyTypeEnum = BodyType.valueOf(data[3].toUpperCase());
            person = new Person(Integer.parseInt(data[2]), professionEnum, genderEnum, bodyTypeEnum, Boolean.parseBoolean(data[5]));
            person.setAsYou(Boolean.parseBoolean(data[6]));
        } else if ("animal".equals(clazz)) {
            // the enums declared in the person branch are out of scope here;
            // note that valueOf throws IllegalArgumentException on an empty cell
            Gender genderEnum = Gender.valueOf(data[1].toUpperCase());
            BodyType bodyTypeEnum = BodyType.valueOf(data[3].toUpperCase());
            animal = new Animal(Integer.parseInt(data[2]), genderEnum, bodyTypeEnum, data[7]);
            animal.setIsPet(Boolean.parseBoolean(data[8]));
        }
    }
} catch (IOException e) {
    e.printStackTrace();
}
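Some rows in the printed data have an empty profession or bodyType cell, and an enum's valueOf throws IllegalArgumentException for an empty string. A possible guard is sketched below. The Profession constants shown are assumptions (the real enum is not part of the post), and parseEnum is a hypothetical helper.

```java
import java.util.Locale;

public class EnumCells {
    // Stand-in for the asker's Profession enum; these constants are assumed.
    enum Profession { DOCTOR, CEO, UNKNOWN, NONE }

    // Maps a CSV cell to an enum constant, returning the fallback
    // when the cell is blank or does not match any constant.
    static <E extends Enum<E>> E parseEnum(Class<E> type, String cell, E fallback) {
        if (cell == null || cell.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Enum.valueOf(type, cell.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Negative limit keeps empty cells, so data[4] exists but is "".
        String[] data = "person,female,2,average,,false,false,,,passenger".split("\\s*,\\s*", -1);
        System.out.println(parseEnum(Profession.class, data[4], Profession.NONE)); // NONE
    }
}
```

Whether a blank cell should map to a default constant or be rejected depends on the application; the fallback here is just one choice.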
