Comparing rows in a CSV file: which data structure is best to use?

Question

Below are a few records from my CSV file:

Server1, Database, Oracle, 5.5
Server2, Database, Oracle, 6.2
Server3, OS, Ubuntu, 10.04
Server1, OS, Ubuntu, 10.04
Server2, OS, Ubuntu, 12.04
Server3, Language, Jav, 2.6.3

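The file indicates that Server1 has Oracle version 5.5 installed, Server2 has version 6.2 installed, and Server3 has Ubuntu version 10.04 installed.

I need to find the list of software package names for which an out-of-date version (that is, any version other than the latest one) is installed on at least 2 different servers. So, in this case, the output of the program is: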

Ubuntu

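I tried parsing the above CSV file into an ArrayList, but found it difficult to implement the rest of the logic on top of it.

Can someone suggest the best data structure to use for this problem? Please also provide some pointers on how to approach it.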

Answer 1

You can get a rough idea from the sample code below:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.*;

public class Test {
    private static final int MAX_LIMIT = 2;

    public static void main(String[] args) throws Exception {
        ArrayList<Package> packages = new ArrayList<>();

        // change the path name
        String path = "G:/data.csv";

        parseCSVFile(path, packages);


        updateProductVersion(packages);

    }

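    private static void updateProductVersion(ArrayList<Package> packages) {
        // Maps each product name to the highest version seen so far
        HashMap<String, String> latestVersionOfProducts = new HashMap<>(packages.size());
        for (Package p : packages) {
            String setVersion = latestVersionOfProducts.get(p.product);
            if (setVersion == null || compareVersions(p.version, setVersion) > 0) {
                latestVersionOfProducts.put(p.product, p.version);
            }
        }
        showLatestVersionsOfProducts(latestVersionOfProducts);

        detectOutdatedSystems(packages, latestVersionOfProducts);
    }

    // Compares dotted versions component by component, so that "10.04" > "9.10".
    // Comparing the dot-stripped strings would rank "910" above "1004".
    // Assumes every component is a plain non-negative integer.
    private static int compareVersions(String a, String b) {
        String[] x = a.split("\\."), y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i].trim()) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i].trim()) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }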

    private static void detectOutdatedSystems(ArrayList<Package> packages, HashMap<String, String> latestVersionOfProducts) {
        Set<Map.Entry<String, String>> products = latestVersionOfProducts.entrySet();
        boolean allNew = true;
        for (Map.Entry<String, String> product : products) {
            String productName = product.getKey();
            String productVersion = product.getValue();

            ArrayList<Package> outdates = new ArrayList<>();
            // Track distinct servers, since the requirement is "at least 2 different servers"
            Set<String> outdatedServers = new HashSet<>();
            for (Package p : packages) {
                if (p.product.equalsIgnoreCase(productName) && !p.version.equalsIgnoreCase(productVersion)) {
                    outdates.add(p);
                    outdatedServers.add(p.machine);
                }
            }
            if (outdatedServers.size() >= MAX_LIMIT) {
                displayOutdates(outdates, productName);
                allNew = false;
            }
        }
        if (allNew) {
            System.out.println("All systems up to date");
        }
    }

    private static void displayOutdates(ArrayList<Package> outdates, String productName) {
        System.out.println(outdates.size() + " systems using outdated version of " + productName);
        for (Package aPackage : outdates) {
            System.out.println(aPackage);
        }
        System.out.println("---------------");
    }

    private static void showLatestVersionsOfProducts(HashMap<String, String> latestVersionOfProducts) {
        System.out.println("-----------------------------------------");
        System.out.println("latest versions detected are");
        Set<Map.Entry<String, String>> entries = latestVersionOfProducts.entrySet();
        System.out.println("\nVersion\t\tProduct");
        for (Map.Entry<String, String> entry : entries) {
            System.out.format("%-7s\t\t%s\n", entry.getValue(), entry.getKey());
        }
        System.out.println("-----------------------------------------");
    }


    private static void parseCSVFile(String path, ArrayList<Package> packages) throws FileNotFoundException {
        // try-with-resources closes the Scanner (and the file) when parsing is done
        try (Scanner scanner = new Scanner(new File(path))) {
            while (scanner.hasNextLine()) {
                packages.add(new Package(scanner.nextLine()));
            }
        }
    }


    static class Package {
        String machine;//Server
        String type;//Database or OS or other
        String product;//Oracle or other
        String version;//version number


        public Package(String line) {
            String[] contents = line.split(",");
            machine = contents[0].trim();
            type = contents[1].trim();
            product = contents[2].trim();
            version = contents[3].trim();
        }

        public static String computeVCode(String version) {
            return version.replace(".", "").replaceAll(" ", "").toLowerCase().trim();
        }

        @Override
        public String toString() {
            return product + ' ' + type + " version:" + version + " is installed on " + machine;
        }
    }
}
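
As a possible take on the data-structure question itself, the rows could also be grouped into a nested map: package name, then version, then the set of servers running that version. The sketch below is only an illustration and not part of the answer above; the class name `OutdatedPackages`, the method names, and the assumption that versions are dot-separated integers are all mine.

```java
import java.util.*;

public class OutdatedPackages {

    // Compares dotted versions numerically, component by component
    // (assumption: every component is a plain integer).
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\."), y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) return Integer.compare(xi, yi);
        }
        return 0;
    }

    // Returns the names of packages whose non-latest versions are installed
    // on at least minServers distinct servers.
    static List<String> findOutdated(List<String> csvLines, int minServers) {
        // package name -> (version -> servers), versions sorted ascending
        Map<String, TreeMap<String, Set<String>>> byPackage = new TreeMap<>();
        for (String line : csvLines) {
            String[] f = line.split(",");
            if (f.length < 4) continue; // skip blank or malformed lines
            String server = f[0].trim(), name = f[2].trim(), version = f[3].trim();
            byPackage
                .computeIfAbsent(name, k -> new TreeMap<String, Set<String>>(OutdatedPackages::compareVersions))
                .computeIfAbsent(version, k -> new HashSet<String>())
                .add(server);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, TreeMap<String, Set<String>>> e : byPackage.entrySet()) {
            TreeMap<String, Set<String>> versions = e.getValue();
            Set<String> outdatedServers = new HashSet<>();
            // headMap(lastKey()) holds every version strictly below the latest one
            for (Set<String> servers : versions.headMap(versions.lastKey()).values()) {
                outdatedServers.addAll(servers);
            }
            if (outdatedServers.size() >= minServers) result.add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Server1, Database, Oracle, 5.5",
            "Server2, Database, Oracle, 6.2",
            "Server3, OS, Ubuntu, 10.04",
            "Server1, OS, Ubuntu, 10.04",
            "Server2, OS, Ubuntu, 12.04",
            "Server3, Language, Jav, 2.6.3");
        System.out.println(findOutdated(lines, 2)); // prints [Ubuntu]
    }
}
```

Keeping the versions in a sorted map means the latest version is simply the last key, and a `Set` per version counts each server only once.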

Answer 2

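Which data structure is best to use?

The answer is probably subjective. I would suggest a List<String[]>. Here the List holds the lines of the file, and each String array holds the comma-separated fields of one line.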

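import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;

Path filePath = new File("resources/file.csv").toPath();
List<String[]> info = new ArrayList<>();

// try-with-resources closes the stream returned by Files.lines
try (Stream<String> lines = Files.lines(filePath)) {
    lines.forEach(line -> info.add(line.split(",")));

    // A row is outdated if its version code is lower than the highest code of the same product.
    // Caveat: dropping the dots ("10.04" -> 1004) only orders versions correctly
    // when every version of a product has the same number of digits per component.
    List<String[]> oldSoftware = info.stream()
            .filter(line -> Integer.parseInt(line[3].trim().replaceAll("\\.", "")) <
                    info.stream()
                            .filter(line2 -> line2[2].equalsIgnoreCase(line[2]))
                            .map(line3 -> Integer.parseInt(line3[3].trim().replaceAll("\\.", "")))
                            .max(Integer::compare).get())
            .collect(Collectors.toList());
} catch (IOException e) {
    System.out.println("Can't read the file");
}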
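
The snippet above collects the outdated rows but does not apply the "at least 2 different servers" condition from the question. One way that step might be finished is sketched here, assuming each row is a `String[]` of server, type, package, and version as in the answer. The class `OutdatedFilter` and the method `packagesOnAtLeast` are hypothetical names used only for illustration.

```java
import java.util.*;
import java.util.stream.Collectors;

public class OutdatedFilter {

    // Groups outdated rows by package name (column 2), collects the distinct
    // servers (column 0) per package, and keeps packages with enough servers.
    static List<String> packagesOnAtLeast(List<String[]> oldSoftware, int minServers) {
        Map<String, Set<String>> serversByPackage = oldSoftware.stream()
                .collect(Collectors.groupingBy(
                        (String[] row) -> row[2].trim(),
                        TreeMap::new, // sorted keys give a stable output order
                        Collectors.mapping((String[] row) -> row[0].trim(), Collectors.toSet())));
        return serversByPackage.entrySet().stream()
                .filter(e -> e.getValue().size() >= minServers)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Outdated rows for the sample file: Oracle 5.5 and both Ubuntu 10.04 installs
        List<String[]> oldSoftware = Arrays.asList(
                "Server1, Database, Oracle, 5.5".split(","),
                "Server3, OS, Ubuntu, 10.04".split(","),
                "Server1, OS, Ubuntu, 10.04".split(","));
        System.out.println(packagesOnAtLeast(oldSoftware, 2)); // prints [Ubuntu]
    }
}
```

Collecting servers into a `Set` rather than counting rows avoids reporting a package that appears twice on the same server.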

Answer 3

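I added the method below and call it from the main method. I don't think it is fully efficient, but it is able to read the CSV file and passes many test cases.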

    private void findDuplicates(List<Inventory> inventoryList) {
        // Sort by software name so that entries for the same software are adjacent
        Collections.sort(inventoryList, new SoftwareComparator());

        int size = inventoryList.size();
        int softwareCount = 0; // adjacent pairs seen so far with the same software name

        for (int i = 0; i < size - 1; i++) {
            Inventory inv1 = inventoryList.get(i);
            Inventory inv2 = inventoryList.get(i + 1);

            if (inv1.getSoftwareName().equals(inv2.getSoftwareName())) {
                softwareCount++;
                if (inv1.getVersionNum().equals(inv2.getVersionNum()) || softwareCount == 2) {
                    // Printed at most once per software: on the second matching pair, if the servers differ
                    if (!inv1.getServerName().equals(inv2.getServerName()) && softwareCount == 2) {
                        System.out.println(inv1.getSoftwareName() + "   " + inv1.getVersionNum());
                    }
                }
            } else {
                softwareCount = 0; // a different software name starts a new run
            }
        }
    }

class SoftwareComparator implements Comparator<Inventory> {

    @Override
    public int compare(Inventory obj1, Inventory obj2) {
        return obj1.getSoftwareName().compareTo(obj2.getSoftwareName());
    }

}
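
The answer does not show the `Inventory` class that `findDuplicates` and `SoftwareComparator` depend on. A minimal sketch of what it might look like follows, assuming one object per CSV row and the three getters used above. The field names, the constructor, `getSoftwareType`, and the `fromCsvLine` helper are hypothetical and do not come from the original answer.

```java
public class Inventory {
    private final String serverName;   // e.g. "Server1"
    private final String softwareType; // e.g. "OS"
    private final String softwareName; // e.g. "Ubuntu"
    private final String versionNum;   // e.g. "10.04"

    public Inventory(String serverName, String softwareType, String softwareName, String versionNum) {
        this.serverName = serverName;
        this.softwareType = softwareType;
        this.softwareName = softwareName;
        this.versionNum = versionNum;
    }

    // Hypothetical helper: builds an Inventory from a line such as "Server1, OS, Ubuntu, 10.04"
    public static Inventory fromCsvLine(String line) {
        String[] fields = line.split(",");
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 comma-separated fields: " + line);
        }
        return new Inventory(fields[0].trim(), fields[1].trim(), fields[2].trim(), fields[3].trim());
    }

    public String getServerName() { return serverName; }

    public String getSoftwareType() { return softwareType; }

    public String getSoftwareName() { return softwareName; }

    public String getVersionNum() { return versionNum; }
}
```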

3 answers

Answer 1: 0, accepted, 2017-02-13 18:44:04

Answer 2: 0, 2017-02-13 19:17:42

Answer 3: 0, 2017-02-15 02:08:10
