Reading a CSV file (which contains characters, integers and special symbols) and extracting integers from it in Java

Please help me out with this code. I'm opening a CSV file, whose contents are given below, and I'm trying to extract the numbers from it, but it throws exceptions.

import java.io.*;
import java.util.*;

public class FDS2 
{
   public static void main(String[] args) throws IOException
    {
     ArrayList<String> al1 = new ArrayList<String>();
     ArrayList<Integer> al2= new ArrayList<>();

    try
    {
        BufferedReader finp = new BufferedReader(new FileReader("ex3.csv"));
        String str ;
        String strarr[];

        while((str=finp.readLine())!=null)
        {
            strarr = str.split(",") ;

             for(int i=0;i<strarr.length;i++)
             {
                 al1.add(strarr[i]);
             }


             for(int i=0;i<al1.size();i++)
             {
                 if (Character.isDigit(al1.get(i).charAt(0))==false)//||(al1.get(i)==null))
                 {
                     al1.remove(i);
                 }
                 else
                 {
                    System.out.println(al1.get(i)); 
                 }
             }

             for(int i=0;i<al1.size();i++)
             {
                 al2.add(Integer.parseInt(al1.get(i)));
                 //System.out.println(b.get(i));
             }

        }    
    }

    catch(IOException e)
        {
            System.out.println(e);     
        }

     System.out.println(al2);  

    }   
}

My CSV file looks like this:

before,after,
100,109,
93,125,(Highly unexpected!)
106,104,(No change)
115,101,
if (Character.isDigit(al1.get(i).charAt(0))==false)//||(al1.get(i)==null))
{
    al1.remove(i);
}

I won't judge the style of how you're trying to do this (I'd recommend using OpenCSV), but I believe your mistake is that you're removing elements by index while iterating forward. Once you've removed the 0th element from the list, al1.size() is 1 and i is already 1, so your remove loop terminates without removing all the text elements.
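One way to sidestep the index problem is to let the list do the removal itself. The following is only a minimal sketch: the class name IntegerFieldFilter and the helper isInteger are made up for illustration and are not part of the original code. It assumes a field counts as a number only if every character is a digit, which is stricter than the first-character check in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntegerFieldFilter {
    // Hypothetical helper: true when the field is non-empty and all digits.
    // Assumption: values fit in an int; longer digit runs would still make
    // Integer.parseInt throw NumberFormatException.
    static boolean isInteger(String field) {
        return !field.isEmpty() && field.chars().allMatch(Character::isDigit);
    }

    // Splits one CSV line and returns only the integer fields.
    static List<Integer> extract(String line) {
        List<String> fields = new ArrayList<>(Arrays.asList(line.split(",")));
        // removeIf visits every element, so nothing is skipped after a removal.
        fields.removeIf(f -> !isInteger(f.trim()));

        List<Integer> numbers = new ArrayList<>();
        for (String f : fields) {
            numbers.add(Integer.parseInt(f.trim()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extract("before,after,"));               // prints []
        System.out.println(extract("93,125,(Highly unexpected!)")); // prints [93, 125]
    }
}
```

Because a fresh list is built for each line, text left over from an earlier line cannot reach Integer.parseInt.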

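If you are open to advice about not reinventing the wheel, I suggest using a library called BeanIO, which I have used successfully on several projects (for parsing and validation).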

I used to think that using the split method was the easiest and most efficient way to parse CSV files. I highly suggest looking up Apache Commons CSV. It makes life a whole lot easier and allows you to do exactly what you're looking for.

If what you need is this simple, I would do it directly without extra dependencies. If I have understood correctly what you are trying to do, with Java 8 it would be something like this:

Stream<String> lines = Files.lines(Paths.get("/path/to/file.csv"));

List<Integer> result = 
    lines.flatMap(l -> Arrays.stream(l.split(",")))
    .filter(this::isDigit)
    .map(Integer::parseInt)
    .collect(toList());

(full code on https://gist.github.com/alacambra/b77d80e19c30c477bcb3 )
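For readers who cannot open the gist, here is a self-contained sketch of the same stream pipeline. It is not the gist's code: the class name CsvIntegers and the helper isDigits are assumptions made for illustration, and the helper accepts a field only when it is non-empty and all digits.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CsvIntegers {
    // Hypothetical stand-in for the isDigit method referenced in the snippet.
    static boolean isDigits(String s) {
        return !s.isEmpty() && s.chars().allMatch(Character::isDigit);
    }

    // Collects every all-digit field from the given lines, in file order.
    static List<Integer> fromLines(Stream<String> lines) {
        return lines.flatMap(l -> Arrays.stream(l.split(",")))
                    .map(String::trim)
                    .filter(CsvIntegers::isDigits)
                    .map(Integer::parseInt)
                    .collect(Collectors.toList());
    }

    // Files.lines keeps the file open, so close the stream when done.
    static List<Integer> fromFile(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path)) {
            return fromLines(lines);
        }
    }

    public static void main(String[] args) {
        Stream<String> sample = Stream.of("before,after,", "100,109,",
                "93,125,(Highly unexpected!)");
        System.out.println(fromLines(sample)); // prints [100, 109, 93, 125]
    }
}
```

A static method reference is used here instead of this::isDigit so that the pipeline also works from a static context such as main.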

However, the problem in your code is not the CSV but the fact that you are removing items from the list before you have scanned it completely.

for (int i = 0; i < al1.size(); i++) {
    if (Character.isDigit(al1.get(i).charAt(0)) == false) {

        al1.remove(i); // <---- that's the error

    } else {
        System.out.println(al1.get(i));
    }        
}
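If you want to keep an index-based loop, one common fix is to walk the list from the end, so that a removal only shifts elements that have already been checked. This is a sketch under that assumption; the class name BackwardRemoval and the method keepDigitFields are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BackwardRemoval {
    // Returns a copy of fields without the entries that do not start with a digit.
    // Note: like the original check, this looks at the first character only,
    // so a field such as "12abc" would be kept and would still fail in parseInt.
    static List<String> keepDigitFields(List<String> fields) {
        List<String> result = new ArrayList<>(fields);
        for (int i = result.size() - 1; i >= 0; i--) {
            String f = result.get(i);
            // The isEmpty guard avoids StringIndexOutOfBoundsException on "".
            if (f.isEmpty() || !Character.isDigit(f.charAt(0))) {
                result.remove(i); // indices below i are unaffected
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(keepDigitFields(Arrays.asList("before", "after"))); // prints []
        System.out.println(keepDigitFields(
                Arrays.asList("93", "125", "(Highly unexpected!)")));          // prints [93, 125]
    }
}
```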

Use uniVocity-parsers to do this for you. It's 3 times faster than apache commons and has many more features:

    String input = "before,after,\n" +
            "100,109,\n" +
            "93,125,(Highly unexpected!)\n" +
            "106,104,(No change)\n" +
            "115,101,";

    ObjectRowListProcessor rowProcessor = new ObjectRowListProcessor();
    rowProcessor.convertFields(Conversions.toInteger()).set("before", "after"); //converts the given columns to integer

    CsvParserSettings settings = new CsvParserSettings(); //many options here, check the tutorial
    settings.setRowProcessor(rowProcessor);
    settings.setHeaderExtractionEnabled(true); //we want to use the first row as the headers row

    settings.selectFields("after", "before"); // here I even switched the order of the fields

    //parse
    new CsvParser(settings).parse(new StringReader(input));

    //get the rows
    List<Object[]> rows = rowProcessor.getRows();
    for(Object[] row : rows){
        System.out.println(Arrays.toString(row));
    }

Output (with fields reordered):

[109, 100]
[125, 93]
[104, 106]
[101, 115]

If you remove the line with settings.selectFields, the output will be:

[100, 109, null]
[93, 125, (Highly unexpected!)]
[106, 104, (No change)]
[115, 101, null]

Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
