
Reading CSV file and storing data in Java

I have a task that requires me to read a CSV file in Java. I have managed to read it, but I don't think I am storing the data in a way that lets me access it in later tasks, such as analyzing some of the data or building a graph. The CSV header contains several variables; some columns hold numbers and others hold text, which means I need to store them as Integer or String values.

Note that I did not use a library such as openCSV to read the file, as I am a beginner and am trying to get familiar with basic Java.

Below is the nycflights13 class, which reads and stores the data. The instructions say not to take in any rows that contain the word "NA".

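`    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class nycflights13 {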

        public static void main(String[] args) {
            // TODO Auto-generated method stub
            List<Flights> NYC13 = readFileFromCSV("flights.csv");

            for(Flights a: NYC13) {
                System.out.println(a);
            }
        }

        public static List<Flights> readFileFromCSV (String fileName){
            List<Flights> flightData = new ArrayList <> (); 
            Path pathToFile = Paths.get(fileName);

            try(BufferedReader br = Files.newBufferedReader(pathToFile,
                    StandardCharsets.US_ASCII)){
                br.readLine();
                String line = br.readLine();

                while (line != null) {
                    String [] variable = line.split(",");

                    //convert string array to list
                    List<String> list = Arrays.asList(variable);
                    if(list.contains("NA")) { //Do not take in rows containing "NA"
                        break;
                    } else {
                        Flights dataset = createFlights(variable);
                        flightData.add(dataset);
                    }
                    line = br.readLine();
                }
            }catch (IOException ioe) {
                ioe.printStackTrace();
            }

            return flightData;
        }



        private static Flights createFlights (String [] metadata) {
            int year = Integer.parseInt(metadata[1]); //convert string into int
            int month = Integer.parseInt(metadata[2]); //convert string into int
            int day = Integer.parseInt(metadata[3]); //convert string into int
            int dep_time = Integer.parseInt(metadata[4]); //convert string into int
            String carrier = metadata[10];
            String flight = metadata[11];
            String origin = metadata[13];
            String dest = metadata[14]; 

            return new Flights(year, month, day, dep_time,carrier, flight, origin, dest);
        }

    }`

Below is my class Flights (I have way more variables than what I showed here):

`    class Flights {
        private int year; 
        private int month; 
        private int day; 
        private int dep_time;
        private String carrier; 
        private String flight; 
        private String origin;
        private String dest; 

        public Flights(int year, int month, int day, int dep_time, String carrier, String flight, String origin, String dest) {
            this.year = year; 
            this.month = month; 
            this.day = day; 
            this.dep_time = dep_time;
            this.carrier = carrier; 
            this.flight = flight; 
            this.origin = origin; 
            this.dest = dest; 
        }

        public int getYear() {return year;}
        public void setYear(int year) {this.year = year;}

        public int getMonth() {return month;}
        public void setMonth(int month) {this.month = month; }

        public int getDay() {return day;}
        public void setDay(int day) {this.day = day; }

        public int getdep_time() {return dep_time;}
        public void setdep_time(int dep_time) {this.dep_time = dep_time; }

        ............
        .............
        ...........


        @Override
        public String toString() {
            return "Flights [year=" + year + ", month=" + month + ", day=" + day
                    + ", dep_time=" + dep_time + ", carrier=" + carrier + ", flight=" + flight
                    + ", origin=" + origin + ", dest=" + dest
                    + ", air_time=" + air_time + ", distance=" + distance
                    + ", hour=" + hour + ", minute=" + minute
                    + ", time_hour=" + time_hour + "]";
        }
    }`

The code above produces output like the following:

Flights [year=2013, month=1, day=1, dep_time=926, sched_dep_time=929, dep_delay=-3, arr_time=1404, sched_arr_time=1421, arr_delay=-17, carrier="B6", flight=215, tailnum="N775JB", origin="EWR", dest="SJU", air_time=191, distance=1608, hour=9, minute=29, time_hour=2013-01-01 09:00:00]

Flights [year=2013, month=1, day=1, dep_time=926, sched_dep_time=922, dep_delay=4, arr_time=1221, sched_arr_time=1219, arr_delay=2, carrier="B6", flight=57, tailnum="N534JB", origin="JFK", dest="PBI", air_time=151, distance=1028, hour=9, minute=22, time_hour=2013-01-01 09:00:00]

Flights [year=2013, month=1, day=1, dep_time=926, sched_dep_time=928, dep_delay=-2, arr_time=1233, sched_arr_time=1220, arr_delay=13, carrier="UA", flight=1597, tailnum="N27733", origin="EWR", dest="EGE", air_time=287, distance=1726, hour=9, minute=28, time_hour=2013-01-01 09:00:00]

Flights [year=2013, month=1, day=1, dep_time=927, sched_dep_time=930, dep_delay=-3, arr_time=1231, sched_arr_time=1257, arr_delay=-26, carrier="DL", flight=1335, tailnum="N951DL", origin="LGA", dest="RSW", air_time=166, distance=1080, hour=9, minute=30, time_hour=2013-01-01 09:00:00]

I have a few questions:

  1. My CSV file actually contains more than 300k rows, but with the code above I only manage to print about 280 lines. Did the code go wrong, or is there an upper limit on the number of lines Eclipse can print?

  2. How can I access a particular variable from the List<Flights>, such as carrier or month, to calculate the total number of carriers or to count the frequency of each month?

  3. What is the correct way to store data with multiple variables so that I can access it from another class? Or how can I improve my current code?

I appreciate your feedback and time. Thanks a million.

Answering your queries:

  1. If you did not get any error or exception while executing the code, you need not worry. The Eclipse console has a limited default buffer size, so part of the output may be dropped. See https://javarevisited.blogspot.com/2013/03/how-to-increase-console-buffer-size-in.html

  2. Now that you have read the data, you should go ahead and save it in a database. Once the data is in a database, you can run whatever queries you need to get the data that satisfies your conditions.

  3. I did not understand what you mean by "ways to store data with multiple variables". Could you clarify?
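
Two points from the question can also be checked in plain Java, without a database. First, the `break` in readFileFromCSV leaves the loop at the first row that contains "NA", so no later rows are read. Whether that or the console buffer explains the roughly 280 printed lines is not certain; printing `NYC13.size()` would show how many rows were actually stored. Second, counts per month or per carrier can be computed directly from the list with streams. The sketch below is a minimal illustration under assumptions: `FlightStats`, its trimmed-down `Flight` class, and the two-column `month,carrier` row layout are hypothetical stand-ins, not the real flights.csv layout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class FlightStats {

    // Hypothetical, trimmed-down stand-in for the Flights class in the question.
    static final class Flight {
        private final int month;
        private final String carrier;

        Flight(int month, String carrier) {
            this.month = month;
            this.carrier = carrier;
        }

        int getMonth() { return month; }
        String getCarrier() { return carrier; }
    }

    // Parses data rows (header already removed), skipping any row with an "NA" field.
    static List<Flight> parseRows(List<String> rows) {
        List<Flight> flights = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split(",");
            if (Arrays.asList(fields).contains("NA")) {
                continue; // skip this row only; "break" would stop reading altogether
            }
            // Assumed column layout for this sketch: month,carrier
            flights.add(new Flight(Integer.parseInt(fields[0]), fields[1]));
        }
        return flights;
    }

    // Number of flights per month, sorted by month.
    static Map<Integer, Long> countByMonth(List<Flight> flights) {
        return flights.stream()
                .collect(Collectors.groupingBy(Flight::getMonth, TreeMap::new, Collectors.counting()));
    }

    // Number of flights per carrier, sorted by carrier code.
    static Map<String, Long> countByCarrier(List<Flight> flights) {
        return flights.stream()
                .collect(Collectors.groupingBy(Flight::getCarrier, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("1,UA", "1,NA", "2,B6", "1,B6");
        List<Flight> flights = parseRows(rows);
        System.out.println(flights.size());          // 3 (the "NA" row is skipped)
        System.out.println(countByMonth(flights));   // {1=2, 2=1}
        System.out.println(countByCarrier(flights)); // {B6=2, UA=1}
    }
}
```

In the question's while loop, simply replacing `break` with `continue` would skip `line = br.readLine()` and loop forever. There it is simpler to invert the test (`if (!list.contains("NA")) { ... }`) so the loop always advances to the next line.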
