Java: summing elements from multiple csv files

I just started learning Java, and I have a small project that I am using to learn the language. One part of this project is to open data files one by one, read the elements in each column, and sum the corresponding elements across the files.

To explain a bit more: I would like to add the first element from the first file to the first element from the second file, and so on until the last file. I have 4 CSV files; each file has 24 columns and each column has 1000 elements.

Please accept my apologies if my question sounds very silly, but I have been trying to do this for more than three days :'(

I hope one of the experts here can help me get around this obstacle!

All the best.

Here is part of the code I've written. The problem is that it reads the entire column of each file, while I only want to read one element per file. The reason is that I want to do some data manipulation later, such as taking the average or standard deviation of those separate elements:

    //================================= Generate XY-data for calculations
    static double[][][] node_Data(String filename, int colmn) throws IOException {
        // I skipped here the stuff which you don't need, not relevant.
        node_data = new double[numberOfFiles][colmnLenght][numberOfColmns];
        try {
            scan = new Scanner(new BufferedReader(new FileReader(filename)));
            scan.nextLine(); // skip the header line

            colmn_entries = 0;
            for (int experiment = firstFile_index; experiment < lastFile_index; experiment++) {
                while (scan.hasNext()) {
                    scanedData = scan.nextLine();
                    String[] array = scanedData.split(",");
                    node_data[experiment][colmn_entries][colmn] = Double.parseDouble(array[colmn]);
                    //System.out.println(node_data[experiment][colmn_entries][colmn]);
                    colmn_entries++;
                }
            }
        } catch (FileNotFoundException e) { e.printStackTrace(); }

        return node_data;
    }
    //-------------------------------------- End of XY-generator

I then call the above function in a loop over the columns from a main() function, which supplies the file name (basically its path and its index).

The input files should read as follows: file (1):

A, B, C, D, E, F, ...
1, 2, 3, 4, 5, 6, ...
1, 2, 3, 4, 5, 6, ...
1, 2, 3, 4, 5, 6, ...
1, 2, 3, 4, 5, 6, ...

file (2): 
A, B, C, D, E, F, ...
7, 8, 9, 10, 11, 12, ...
7, 8, 9, 10, 11, 12, ...
7, 8, 9, 10, 11, 12, ...
7, 8, 9, 10, 11, 12, ...

and so on until file n. The output should be stored somewhere (in an array, a list, or whatever can handle it) and read as follows:

1, 2, 3, 4, 5, 6, ... (coming from file [1])
7, 8, 9, 10, 11, 12, ... (coming from file [2])
13, 14, 15, 16, 17, 18, ... (coming from file [3])
....
....

Eventually, one should be able to work out the mean of the elements taken from the individual files, for example by summing the first (or any) column of the stored output array and dividing by the number of files.
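
The layout described above (one row per file, one entry per column) could be sketched roughly as follows. This is only a minimal sketch, not the asker's code: the class name RowAcrossFiles and its methods are hypothetical, and each file is assumed to be already loaded as a list of text lines whose first line is the header.

```java
import java.util.ArrayList;
import java.util.List;

class RowAcrossFiles {

    // Parse CSV lines into rows of doubles; the first line is assumed to be a header.
    static double[][] parse(List<String> lines) {
        double[][] rows = new double[Math.max(0, lines.size() - 1)][];
        for (int r = 1; r < lines.size(); r++) {
            String[] cells = lines.get(r).split(",");
            rows[r - 1] = new double[cells.length];
            for (int c = 0; c < cells.length; c++) {
                rows[r - 1][c] = Double.parseDouble(cells[c].trim());
            }
        }
        return rows;
    }

    // Collect data row `row` from every file: result[file][column].
    static double[][] rowFromEachFile(List<double[][]> files, int row) {
        double[][] table = new double[files.size()][];
        for (int f = 0; f < files.size(); f++) {
            table[f] = files.get(f)[row].clone();
        }
        return table;
    }

    // Mean of one column of table[file][column], i.e. the mean over the files.
    static double columnMean(double[][] table, int column) {
        double sum = 0.0;
        for (double[] fileRow : table) {
            sum += fileRow[column];
        }
        return sum / table.length;
    }

    public static void main(String[] args) {
        List<double[][]> files = new ArrayList<>();
        files.add(parse(List.of("A, B, C", "1, 2, 3", "1, 2, 3")));
        files.add(parse(List.of("A, B, C", "7, 8, 9", "7, 8, 9")));
        double[][] table = rowFromEachFile(files, 0);
        System.out.println(columnMean(table, 0)); // mean of 1 and 7
    }
}
```

Keeping the table as [file][column] means that a statistic over the files is just a loop down one column.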

With this code, you should be able to read the first element of each of 4 different CSV files and add it to the sumALLCSV variable.

    import java.io.BufferedReader;
    import java.io.FileReader;

    public class ReadCSV {
        public static int sumALLCSV = 0;
        // CSV files to read; static so that main() can use it
        static String[] arrayCSVnames = {"test.csv", "test2.csv", "test3.csv", "test4.csv"};

        public static void main(String[] args) throws Exception {
            String splitBy = ","; // could be ";"

            for (int i = 0; i < arrayCSVnames.length; i++) {
                // try-with-resources closes the reader even if parsing fails
                try (BufferedReader br = new BufferedReader(new FileReader(arrayCSVnames[i]))) {
                    br.readLine();               // skip the header line (A, B, C, ...)
                    String line = br.readLine(); // first data row
                    String[] b = line.split(splitBy);

                    // b[0] is the first element of this CSV file
                    System.out.println(b[0]);
                    // add it to the running total; trim() guards against spaces around the value
                    sumALLCSV += Integer.parseInt(b[0].trim());
                }
            }
        }
    }
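
To sum every element rather than only the first one, the same idea can be extended to an element-wise sum. The following is a minimal sketch, assuming that all files have the same shape and a one-line header; the class name ElementwiseSum is hypothetical, and the file names are taken from the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ElementwiseSum {

    // Add the cells of every file position by position: total[row][column].
    // Each file is a list of CSV lines; line 0 is assumed to be the header.
    static double[][] sum(List<List<String>> files) {
        int rows = files.get(0).size() - 1;
        int cols = files.get(0).get(0).split(",").length;
        double[][] total = new double[rows][cols];
        for (List<String> lines : files) {
            for (int r = 0; r < rows; r++) {
                String[] cells = lines.get(r + 1).split(",");
                for (int c = 0; c < cols; c++) {
                    total[r][c] += Double.parseDouble(cells[c].trim());
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java ElementwiseSum file1.csv file2.csv ...");
            return;
        }
        List<List<String>> files = new ArrayList<>();
        for (String name : args) {
            files.add(Files.readAllLines(Paths.get(name)));
        }
        double[][] total = sum(files);
        System.out.println(total[0][0]); // first element summed over all files
    }
}
```

Dividing each entry of the result by the number of files would give the element-wise mean.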

OK guys, I managed to fix my code, and here is how I fixed it; maybe it will help someone else. The while loop was actually causing all of the trouble, and I had to replace it with a for loop.

package your_package_goes_here;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.util.Scanner;

public class data_vault { // start of the class "{"

    ///private static int elementSize;
    static double[][][] data;
    static int data_size;


    static void data_generator(int firstFile, int lastFile, int firstColmn, int lastColmn) throws NumberFormatException, IOException {

        for (int colmn = firstColmn; colmn < lastColmn; colmn++) {
            //System.out.println("------ column: " + colmn); // just to follow up with the ordering, not needed at all.
            for (int index = firstFile; index < lastFile; index++) {
                fetch_data(index, colmn);
                data_manipulate.data_mean(index, colmn);
                // You can basically do whatever you want with this data now,
                // here I'm just taking the mean as a simple example (implemented in a different class "data_manipulate")
            }
        }
    }
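
The data_manipulate class is not shown in this post. As a rough idea of what such a helper might look like, here is a minimal sketch that computes the mean and the sample standard deviation of one column of data[file][row][column]. The class name ColumnStats and its method signatures are assumptions made for this example, not the actual data_manipulate code.

```java
class ColumnStats {

    // Mean of data[file][row][column] over all rows of one file.
    static double mean(double[][][] data, int file, int column) {
        double sum = 0.0;
        for (double[] row : data[file]) {
            sum += row[column];
        }
        return sum / data[file].length;
    }

    // Sample standard deviation (divides by n - 1) of the same column.
    static double stdDev(double[][][] data, int file, int column) {
        double m = mean(data, file, column);
        double squares = 0.0;
        for (double[] row : data[file]) {
            double d = row[column] - m;
            squares += d * d;
        }
        return Math.sqrt(squares / (data[file].length - 1));
    }

    public static void main(String[] args) {
        // One file, four rows, one column (made-up values).
        double[][][] data = { { {1.0}, {2.0}, {3.0}, {4.0} } };
        System.out.println(mean(data, 0, 0));
        System.out.println(stdDev(data, 0, 0));
    }
}
```

The sample form needs at least two rows; with a single row it divides by zero and returns NaN.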

The function below collects the data from each file and stores it in a 3D array, double[][][] data. Note that the files are identified by int index here, as explained below for the file_call() function.

//================================= Generate data for calculations
public static double[][][] fetch_data(int index, int node) throws IOException{

    int length = file_length(index);
    if (data == null) {
        // Allocate once so that values read by earlier calls are kept.
        data = new double [number_of_files][length_of_file][number_of_columns]; // This should be the size of your 3d array.
    }
    // Open the file once; opening it inside the loop would re-read the first data row every time.
    try (Scanner scan = new Scanner (new BufferedReader(new FileReader(file_call(index))))) {
        scan.nextLine(); // skip the header line
        for (int i = 0; i < length && scan.hasNextLine(); i++){
            String scanedData = scan.nextLine();
            String [] array = scanedData.split(",");
            data[index][i][node] = Double.parseDouble(array[node]);
            //System.out.println("node: " + node + ", entry: " + data[index][i][node]);
        }
    }
    //System.out.print("file: " + file_call(index));
    return data;
}
//-------------------------------------------------------
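
An alternative worth considering is to read each file only once and keep all of its columns, instead of re-opening the file for every column. The following is a minimal sketch, assuming a one-line header and purely numeric cells; CsvTable and load are hypothetical names, and the rows are collected in a list so that no separate line count is needed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class CsvTable {

    // Read every data row from `source` into table[row][column], skipping the header.
    static double[][] load(Reader source) throws IOException {
        List<double[]> rows = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(source)) {
            br.readLine(); // header line, ignored
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // tolerate blank lines
                }
                String[] cells = line.split(",");
                double[] row = new double[cells.length];
                for (int c = 0; c < cells.length; c++) {
                    row[c] = Double.parseDouble(cells[c].trim());
                }
                rows.add(row);
            }
        }
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a file, used only for this example.
        double[][] table = load(new StringReader("A, B, C\n1, 2, 3\n4, 5, 6\n"));
        System.out.println(table.length + " rows, " + table[0].length + " columns");
    }
}
```

For a file on disk, a FileReader for the path returned by file_call(index) could be passed as the source.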
//-------------------------------------------------------

Because the files in my project are indexed, with names that appear in the path as shown below, I had to create a function that can be called later in the main loop to run over the file indices:

../path/to/file/data_file_parameter_setting_1.csv
../path/to/file/data_file_parameter_setting_2.csv
...
../path/to/file/data_file_parameter_setting_n.csv

And here is what I call file_call(int index); sorry if I picked awkward names for my functions.

//================================= Call the loaded data file and issue it a name
public static String file_call(int index) throws IOException{

    String name = Analytics.exprName.getText();
    String parameter = String.valueOf(Analytics.paramType_1.getSelectedItem());
    String setting =  String.valueOf(Analytics.typeSet);
    String filename = reading_data.locate_file(name, parameter, setting, index);
    //System.out.println("File: " + filename); // just following up here too, no need to print.

    return filename;
}
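
Since file_call() depends on GUI fields and a locate_file() helper that are not shown here, the naming scheme itself can be illustrated in isolation. The following is a minimal sketch in which the class IndexedFiles, the directory, and the prefix are all placeholders chosen for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class IndexedFiles {

    // Build "<directory>/<prefix>_<index>.csv" for a given file index.
    static Path pathFor(String directory, String prefix, int index) {
        return Paths.get(directory, prefix + "_" + index + ".csv");
    }

    public static void main(String[] args) {
        // Placeholder directory and prefix, used only for illustration.
        for (int index = 1; index <= 3; index++) {
            System.out.println(pathFor("data", "data_file_parameter_setting", index));
        }
    }
}
```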

The function below just determines the length of the file, which is in principle the length of each column. I need this value so that I can iterate over all entries.

//================================= Determine file length
public static int file_length(int index) throws IOException{

    LineNumberReader lnr = new LineNumberReader(new FileReader(new File(file_call(index))));
    try {
        lnr.skip(Long.MAX_VALUE); // read to the end so that every line is counted
    } catch (IOException e1) {
        e1.printStackTrace();
    }
    data_size = lnr.getLineNumber() - 1; // subtract the header line
    lnr.close();
    //System.out.println(elementSize);

    return data_size;
}
    //-------------------------------------------------------
} // end of the class "}"
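
The number of data rows can also be obtained with the java.nio.file API. The following is a minimal sketch, assuming Java 8 or later and a single header line; LineCount and dataRows are hypothetical names, and the temporary file exists only to make the example self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class LineCount {

    // Number of data rows in a CSV file: all lines minus the one header line.
    static int dataRows(Path file) throws IOException {
        // The stream holds the file open, so close it with try-with-resources.
        try (Stream<String> lines = Files.lines(file)) {
            return (int) Math.max(0, lines.count() - 1);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("rows", ".csv");
        Files.write(tmp, Arrays.asList("A, B", "1, 2", "3, 4"));
        System.out.println(dataRows(tmp)); // header excluded
        Files.delete(tmp);
    }
}
```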

I hope that someone will find this solution useful; give it a thumbs up if you like it ;) Also, thank you to those who commented on my post, much appreciated.
