
Java - parse CSV into Hashmap and Arraylist based on columns

I would like to ask for help with this task. I have a CSV file like this, for example:

column1$column2$column3
123$xyz$321
456$zyx$654

I would like to parse it in Java so that I end up with a HashMap holding one ArrayList per column. For example:

["column1",[123,456]]
["column2",[xyz,zyx]]
["column3",[321,654]]

Thanks everyone.

Someone already advised me to solve this with an ArrayList of ArrayLists, but I would like to use a HashMap so that I can look up a column by its name or index. How could I change this code?

public static void main(String[] args) {
    ArrayList<ArrayList<String>> columns = new ArrayList<ArrayList<String>>();
    BufferedReader br = null;
    try {
        String sCurrentLine;
        br = new BufferedReader(new FileReader("testing.cvs"));

        while ((sCurrentLine = br.readLine()) != null) {
            String[] fields = sCurrentLine.split("\\$");
            for (int i = 0; i < fields.length; i++) {
                if (columns.size()<=i){
                    columns.add(new ArrayList<String>());
                }
                columns.get(i).add(fields[i]);
            }
        }

    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (br != null)br.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}

Thanks, everyone.

Here is some code to get you started.

This solution uses two maps: one from column name to values, and one from column index to column name.

On the first row I fill the index map and create an empty list for each column; on every later row I append each field to its column's list.

Map<String, List<String>> mappedCols = new HashMap<String, List<String>>();
Map<Integer, String> indexCols = new HashMap<Integer, String>();

BufferedReader br = null;
try {
    String sCurrentLine;
    br = new BufferedReader(new FileReader("testing.cvs"));
    boolean firstLine = true;
    while ((sCurrentLine = br.readLine()) != null) {
        String[] fields = sCurrentLine.split("\\$");
        for (int i = 0; i < fields.length; i++) {
            if (firstLine) {
                indexCols.put(i, fields[i]);
                mappedCols.put(fields[i], new ArrayList<String>());
            } else {
                String colName = indexCols.get(i);
                if (colName == null) {
                    break;
                }
                mappedCols.get(colName).add(fields[i]);
            }
        }
        firstLine = false;
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    try {
        if (br != null)
            br.close();
    } catch (IOException ex) {
        ex.printStackTrace();
    }
}

The result you are after is in mappedCols.
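For comparison, here is a minimal sketch of the same two-map idea, assuming Java 8 or later. It is only an illustration: the class and method names (ColumnIndexSketch, parse) are made up, it reads from any Reader so it can be tried without a file, it uses try-with-resources instead of the explicit finally block, and it swaps in a LinkedHashMap so the columns keep their file order.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ColumnIndexSketch {

    // Hypothetical helper: maps each header name to the values found in that column.
    static Map<String, List<String>> parse(Reader source) {
        Map<String, List<String>> mappedCols = new LinkedHashMap<>(); // keeps header order
        Map<Integer, String> indexCols = new HashMap<>();             // column index -> name

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(source)) {
            String headerLine = br.readLine();
            if (headerLine == null) {
                return mappedCols; // empty input, nothing to map
            }
            // limit -1 keeps trailing empty fields such as "a$b$"
            String[] headers = headerLine.split("\\$", -1);
            for (int i = 0; i < headers.length; i++) {
                indexCols.put(i, headers[i]);
                mappedCols.put(headers[i], new ArrayList<>());
            }

            String line;
            while ((line = br.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // assumption: blank lines carry no data
                }
                String[] fields = line.split("\\$", -1);
                for (int i = 0; i < fields.length; i++) {
                    String colName = indexCols.get(i);
                    if (colName == null) {
                        break; // row has more fields than the header
                    }
                    mappedCols.get(colName).add(fields[i]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mappedCols;
    }

    public static void main(String[] args) {
        String csv = "column1$column2$column3\n123$xyz$321\n456$zyx$654\n";
        System.out.println(parse(new StringReader(csv)));
        // prints {column1=[123, 456], column2=[xyz, zyx], column3=[321, 654]}
    }
}
```

To read the real file, pass new FileReader("testing.cvs") instead of the StringReader.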

Not tested, but should be pretty close:

 public static void main(String[] args) {
        Map<String, List<String>> columnsMap = new HashMap<>();
        BufferedReader br = null;
        try {
            String sCurrentLine;
            br = new BufferedReader(new FileReader("testing.cvs"));

            br.readLine(); // skip the header row; keys are generated below as column1, column2, ...
            while ((sCurrentLine = br.readLine()) != null) {
                String[] fields = sCurrentLine.split("\\$");

                for (int i = 0; i < fields.length; i++) {
                    String columnKey = "column" + (i + 1);
                    List<String> columnValuesList = columnsMap.get(columnKey);
                    if (columnValuesList == null) {
                        columnValuesList = new ArrayList<>();
                    }

                    columnValuesList.add(fields[i]);
                    columnsMap.put(columnKey, columnValuesList);
                }
            }

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (br != null)
                    br.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
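The get, null check, put sequence in the loop above can be shortened with Map.computeIfAbsent, available since Java 8. The sketch below is only an illustration of that idiom: the names ComputeIfAbsentSketch and groupByPosition are invented, the lines are passed in as a List so no file is needed, and a skipHeader flag (my assumption, not part of the answer) decides whether the first row is treated as a header and left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ComputeIfAbsentSketch {

    // Hypothetical helper: groups fields by position under generated keys column1, column2, ...
    static Map<String, List<String>> groupByPosition(List<String> lines, boolean skipHeader) {
        Map<String, List<String>> columnsMap = new LinkedHashMap<>();
        int start = skipHeader ? 1 : 0;
        for (int row = start; row < lines.size(); row++) {
            String[] fields = lines.get(row).split("\\$", -1);
            for (int i = 0; i < fields.length; i++) {
                // creates the list on first use, then appends to it
                columnsMap.computeIfAbsent("column" + (i + 1), k -> new ArrayList<>())
                          .add(fields[i]);
            }
        }
        return columnsMap;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("column1$column2$column3", "123$xyz$321", "456$zyx$654");
        System.out.println(groupByPosition(lines, true));
        // prints {column1=[123, 456], column2=[xyz, zyx], column3=[321, 654]}
    }
}
```

With skipHeader set to false, the header text itself becomes the first value of each list, so the flag matters whenever the file starts with column names.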

The following solution uses a columns map of type Map<String, List<String>>, with the column name (a String) as key and an ArrayList of that column's values as value. I have tested it with varying numbers of columns and rows and it works fine.

public static void main(String[] args) {
    //hashmap with column as key and values as arraylist
    Map<String, List<String>> columns = new HashMap<String, List<String>>();

    //column names
    List<String> columnNames = new ArrayList<String>(); 

    BufferedReader br = null;
    try {
        String sCurrentLine;
        br = new BufferedReader(new FileReader("C:\\testing.csv"));

        //header flag
        boolean header = true; 

        while ((sCurrentLine = br.readLine()) != null) {
            String[] fields = sCurrentLine.split("\\$");

            for (int i = 0; i < fields.length; i++) {   
                if(header) {
                    //array list as null, since we don't have values yet
                    columns.put(fields[i], null); 

                    //also separately store the column names, to be used as keys to get the lists
                    columnNames.add(fields[i]);

                    //reached end of the header line, set header to false 
                    if(i == (fields.length-1))
                        header = false;
                } else {
                    //retrieve the list using the column names
                    List<String> tempList = columns.get(columnNames.get(i)); 

                    if(tempList != null) {
                        //if not null, the list already exists, so just append
                        tempList.add(fields[i]); 
                    } else {
                        //if null then initialize the list
                        tempList = new ArrayList<String>(); 
                        tempList.add(fields[i]);
                    }
                    //add modified list back to the map
                    columns.put(columnNames.get(i), tempList); 
                }
            }
        }

    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (br != null)
                br.close();
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    System.out.println("columns: " + columns.toString());
    //columns: {column1=[123, 456], column3=[321, 654], column2=[xyz, zyx]}
}
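Since the question asks for access by column index as well as by name, the columnNames list plus the map can be wrapped in one small class. This is a rough sketch under a few assumptions: the name IndexedColumnsSketch and its byName/byIndex methods are hypothetical, header names are assumed to be unique, extra fields beyond the header are ignored rather than causing an IndexOutOfBoundsException, and a LinkedHashMap is used so printing follows the header order instead of HashMap's arbitrary order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IndexedColumnsSketch {
    private final List<String> columnNames = new ArrayList<>();            // index -> name
    private final Map<String, List<String>> columns = new LinkedHashMap<>(); // name -> values

    // Expects the header row first; assumes header names are unique.
    IndexedColumnsSketch(List<String> lines) {
        if (lines.isEmpty()) {
            return;
        }
        for (String name : lines.get(0).split("\\$", -1)) {
            columnNames.add(name);
            columns.put(name, new ArrayList<>());
        }
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split("\\$", -1);
            // ignore fields that have no matching header
            int n = Math.min(fields.length, columnNames.size());
            for (int i = 0; i < n; i++) {
                columns.get(columnNames.get(i)).add(fields[i]);
            }
        }
    }

    // Returns null for an unknown column name.
    List<String> byName(String name) {
        return columns.get(name);
    }

    // Zero-based index; throws IndexOutOfBoundsException for an invalid index.
    List<String> byIndex(int index) {
        return columns.get(columnNames.get(index));
    }

    @Override
    public String toString() {
        return columns.toString();
    }

    public static void main(String[] args) {
        IndexedColumnsSketch table = new IndexedColumnsSketch(
                Arrays.asList("column1$column2$column3", "123$xyz$321", "456$zyx$654"));
        System.out.println(table.byIndex(1));       // prints [xyz, zyx]
        System.out.println(table.byName("column3")); // prints [321, 654]
    }
}
```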

Use the ColumnProcessor from univocity-parsers to do this for you reliably:

CsvParserSettings parserSettings = new CsvParserSettings();
parserSettings.getFormat().setLineSeparator("\n");
parserSettings.getFormat().setDelimiter('$');
parserSettings.setHeaderExtractionEnabled(true);

// To get the values of all columns, use a column processor
ColumnProcessor rowProcessor = new ColumnProcessor();
parserSettings.setRowProcessor(rowProcessor);

CsvParser parser = new CsvParser(parserSettings);

//This will kick in our column processor
parser.parse(new FileReader("testing.cvs"));

//Finally, we can get the column values:
Map<String, List<String>> columnValues = rowProcessor.getColumnValuesAsMapOfNames();

Cleaner, faster and easier than doing it by hand. It also handles situations such as rows with differing numbers of columns, adding null to the list instead of throwing exceptions at you.
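To make the null-padding behaviour concrete, here is a plain-Java sketch of what it means for a hand-written parser. It is not the library's implementation, only an illustration of the idea: the names PaddedColumnsSketch and parse are invented, short rows get null in the missing columns, and fields beyond the header are dropped (that last choice is my assumption).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PaddedColumnsSketch {

    // Hypothetical helper: every column list ends up with one entry per data row.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return columns;
        }
        String[] headers = lines.get(0).split("\\$", -1);
        for (String header : headers) {
            columns.put(header, new ArrayList<>());
        }
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split("\\$", -1);
            // iterate over the headers, not the fields, so short rows are padded
            for (int i = 0; i < headers.length; i++) {
                columns.get(headers[i]).add(i < fields.length ? fields[i] : null);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("column1$column2$column3", "123$xyz", "456$zyx$654");
        System.out.println(parse(lines));
        // prints {column1=[123, 456], column2=[xyz, zyx], column3=[null, 654]}
    }
}
```

Keeping all lists the same length means the value at position n in every list belongs to the same data row.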

Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).

