Which data structure is best for comparing the rows of a CSV file?

Question

The following are a few records from my CSV file:

Server1, Database, Oracle, 5.5
Server2, Database, Oracle, 6.2
Server3, OS, Ubuntu, 10.04
Server1, OS, Ubuntu, 10.04
Server2, OS, Ubuntu, 12.04
Server3, Language, Jav, 2.6.3

This file indicates that Server1 has version 5.5 of Oracle installed, Server2 has version 6.2 of Oracle installed, and Server3 has version 10.04 of Ubuntu installed.

I need to find the list of software package names for which an out-of-date version (i.e. a version that is not the latest version) is installed on at least 2 different servers. In this case, the output of the program should be:

Ubuntu

I tried parsing the CSV file above into an ArrayList, but I am finding it difficult to implement the rest of the logic.

Can someone suggest the best data structure to use for this problem? Please also provide some pointers on how to approach it.

Answer 1

The following sample code should give you a rough idea:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.*;

public class Test {
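    // Despite the name, this is a minimum: a product is reported once this many outdated installations are found
    private static final int MAX_LIMIT = 2;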

    public static void main(String[] args) throws Exception {
        ArrayList<Package> packages = new ArrayList<>();

        // change the path to point at your CSV file
        String path = "G:/data.csv";

        parseCSVFile(path, packages);


        updateProductVersion(packages);

    }

    private static void updateProductVersion(ArrayList<Package> packages) {
        HashMap<String, String> latestVersionOfProducts = new HashMap<>(packages.size());
        for (Package p : packages) {
            String currentProduct = p.product;
            String currentVCode = Package.computeVCode(p.version);

            if (!latestVersionOfProducts.containsKey(currentProduct)) {
                latestVersionOfProducts.put(currentProduct, p.version);
            } else {
                String setVersion = latestVersionOfProducts.get(currentProduct);
                if (currentVCode.compareTo(Package.computeVCode(setVersion)) > 0) {
                    latestVersionOfProducts.put(currentProduct, p.version);
                }
            }
        }
        showLatestVersionsOfProducts(latestVersionOfProducts);

        detectOutdatedSystems(packages, latestVersionOfProducts);
    }

    private static void detectOutdatedSystems(ArrayList<Package> packages, HashMap<String, String> latestVersionOfProducts) {
        Set<Map.Entry<String, String>> products = latestVersionOfProducts.entrySet();
        boolean allNew = true;
        for (Map.Entry<String, String> product : products) {
            String productName = product.getKey();
            String productVersion = product.getValue();

            ArrayList<Package> outdates = new ArrayList<>();
            // Track distinct servers, since the requirement is "at least 2 different servers"
            Set<String> outdatedServers = new HashSet<>();
            for (Package p : packages) {
                if (p.product.equalsIgnoreCase(productName) && !p.version.equalsIgnoreCase(productVersion)) {
                    outdates.add(p);
                    outdatedServers.add(p.machine);
                }
            }
            if (outdatedServers.size() >= MAX_LIMIT) {
                displayOutdates(outdates, productName);
                allNew = false;
            }
        }
        if (allNew) {
            System.out.println("All systems up to date");
        }
    }

    private static void displayOutdates(ArrayList<Package> outdates, String productName) {
        System.out.println(outdates.size() + " systems using outdated version of " + productName);
        for (Package aPackage : outdates) {
            System.out.println(aPackage);
        }
        System.out.println("---------------");
    }

    private static void showLatestVersionsOfProducts(HashMap<String, String> latestVersionOfProducts) {
        System.out.println("-----------------------------------------");
        System.out.println("latest versions detected are");
        Set<Map.Entry<String, String>> entries = latestVersionOfProducts.entrySet();
        System.out.println("\nVersion\t\tProduct");
        for (Map.Entry<String, String> entry : entries) {
            System.out.format("%-7s\t\t%s\n", entry.getValue(), entry.getKey());
        }
        System.out.println("-----------------------------------------");
    }


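    private static void parseCSVFile(String path, ArrayList<Package> packages) throws FileNotFoundException {
        // try-with-resources closes the Scanner (and the underlying file) when done
        try (Scanner scanner = new Scanner(new File(path))) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (!line.trim().isEmpty()) { // skip blank lines, which the Package constructor cannot parse
                    packages.add(new Package(line));
                }
            }
        }
    }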


    static class Package {
        String machine;//Server
        String type;//Database or OS or other
        String product;//Oracle or other
        String version;//version number


        public Package(String line) {
            String[] contents = line.split(",");
            machine = contents[0].trim();
            type = contents[1].trim();
            product = contents[2].trim();
            version = contents[3].trim();
        }

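        public static String computeVCode(String version) {
            // Zero-pad each dot-separated part so that text comparison follows numeric order ("9.10" < "10.04");
            // simply stripping the dots would rank "910" above "1004"
            StringBuilder code = new StringBuilder();
            for (String part : version.trim().toLowerCase().split("\\.")) {
                code.append(String.format("%8s.", part.trim()).replace(' ', '0'));
            }
            return code.toString();
        }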

        @Override
        public String toString() {
            return product + ' ' + type + " version:" + version + " is installed on " + machine;
        }
    }
}
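
A side note on comparing versions: strings such as 9.10 and 10.04 do not order correctly if the dots are simply stripped and the remaining digits are compared as text ("910" sorts after "1004"). Below is a minimal sketch of a component-wise numeric comparison. The class name VersionCompare and the method name compare are only illustrative, and the sketch assumes that every component is a non-negative integer.

```java
import java.util.Arrays;
import java.util.List;

public class VersionCompare {

    // Compares two dotted version strings component by component.
    // Assumption: every component is a non-negative integer, e.g. "2.6.3".
    static int compare(String a, String b) {
        String[] x = a.trim().split("\\.");
        String[] y = b.trim().split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // A missing component counts as 0, so "2.6" equals "2.6.0".
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Text comparison of the dot-stripped strings gets this pair wrong.
        System.out.println("910".compareTo("1004") > 0);   // true
        System.out.println(compare("9.10", "10.04") < 0);  // true

        List<String> versions = Arrays.asList("10.04", "9.10", "12.04");
        versions.sort(VersionCompare::compare);
        System.out.println(versions); // [9.10, 10.04, 12.04]
    }
}
```

A version string with non-numeric parts (for example "1.0-beta") would make Integer.parseInt throw, so real data may need extra handling.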

Answer 2

Which is the best Data Structure to use?

The answer is somewhat subjective. I would recommend using List<String[]>. The List holds the lines of the file, and each String array holds the comma-separated fields of one line.

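import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

Path filePath = new File("resources/file.csv").toPath();
List<String[]> info = new ArrayList<>();

// try-with-resources closes the stream opened by Files.lines
try (Stream<String> lines = Files.lines(filePath)) {
    lines.forEach(line -> info.add(line.split(",")));

    // Keep each row whose version is lower than the highest version recorded for the same product.
    // Note: stripping the dots and parsing as int only compares correctly when all versions
    // of a product have the same number of digits in every component.
    List<String[]> oldSoftware = info.stream()
            .filter(line -> Integer.parseInt(line[3].trim().replaceAll("\\.", "")) <
                    info.stream()
                            .filter(line2 -> line2[2].equalsIgnoreCase(line[2]))
                            .map(line3 -> Integer.parseInt(line3[3].trim().replaceAll("\\.", "")))
                            .max(Integer::compare)
                            .get())
            .collect(Collectors.toList());
} catch (IOException e) {
    System.out.println("Can't read the file");
}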
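
The filter above collects the outdated rows but does not yet check how many servers are affected. One possible way to finish the job is to group the rows by product and count the distinct servers whose version is below the maximum. This is only a sketch: the names OutdatedByStream, outdatedProducts and compareVersions are hypothetical, the fields are assumed to be trimmed already, and version components are assumed to be purely numeric.

```java
import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OutdatedByStream {

    // Numeric, component-wise comparison; assumes every component is an integer.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\."), y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) return Integer.compare(xi, yi);
        }
        return 0;
    }

    // rows: trimmed fields {server, type, product, version}
    static List<String> outdatedProducts(List<String[]> rows, int minServers) {
        // Group the rows by product name (column 2); TreeMap keeps the output sorted.
        Map<String, List<String[]>> byProduct = rows.stream()
                .collect(Collectors.groupingBy((String[] r) -> r[2], TreeMap::new, Collectors.toList()));

        List<String> result = new ArrayList<>();
        byProduct.forEach((product, group) -> {
            // Highest version seen for this product; a group is never empty.
            String latest = group.stream()
                    .map(r -> r[3])
                    .max(OutdatedByStream::compareVersions)
                    .get();
            // Distinct servers running anything older than the latest version.
            long servers = group.stream()
                    .filter(r -> compareVersions(r[3], latest) < 0)
                    .map(r -> r[0])
                    .distinct()
                    .count();
            if (servers >= minServers) result.add(product);
        });
        return result;
    }

    public static void main(String[] args) {
        List<String[]> rows = Stream.of(
                        "Server1, Database, Oracle, 5.5",
                        "Server2, Database, Oracle, 6.2",
                        "Server3, OS, Ubuntu, 10.04",
                        "Server1, OS, Ubuntu, 10.04",
                        "Server2, OS, Ubuntu, 12.04",
                        "Server3, Language, Jav, 2.6.3")
                .map(line -> Arrays.stream(line.split(",")).map(String::trim).toArray(String[]::new))
                .collect(Collectors.toList());
        System.out.println(outdatedProducts(rows, 2)); // prints [Ubuntu]
    }
}
```

Grouping first means each product's maximum version is computed once, rather than once per row as in the nested stream.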

Answer 3

I added the following method and call it from the main method. I don't think it is fully efficient, but the program is able to read the CSV file and passes many of the test cases.

    private void findDuplicates(List<Inventory> inventoryList){

        Collections.sort(inventoryList, new SoftwareComparator());

        int size = inventoryList.size();
        int softwareCount=0;

        for(int i=0; i <size-1 ; i++){
            Inventory inv1 = inventoryList.get(i);
            Inventory inv2 = inventoryList.get(i+1);

            if(inv1.getSoftwareName().equals(inv2.getSoftwareName())){
                softwareCount++;
                if(inv1.getVersionNum().equals(inv2.getVersionNum()) || softwareCount==2 ){
                    if(!inv1.getServerName().equals(inv2.getServerName()) && softwareCount==2){
                        System.out.println(inv1.getSoftwareName() +"   "+ inv1.getVersionNum());
                    }
                }
            }else{
                softwareCount=0;
            }
        }

    }

class SoftwareComparator implements Comparator<Inventory> {

    @Override
    public int compare(Inventory obj1, Inventory obj2) {
        return obj1.getSoftwareName().compareTo(obj2.getSoftwareName());
    }

}

3 answers

solution1
0 ACCEPTED 2017-02-13 18:44:04

solution2
0 2017-02-13 19:17:42

solution3
0 2017-02-15 02:08:10
