Using HashMaps to map a total and average value to a key

I have a Country class and have read in data from a .csv file that contains many country names, the region each country is in, its population, its area and so on, and stored it in an ArrayList. I am carrying out a data analysis, mostly using the Java Collections Framework, and want to find both the total and the average population for each region.

I figured a HashMap is best for this, but I don't know how to go about it, as I've never used one before in any complex way or with objects. I also know I'll have to change the datatype from int to long for the total population.

public class Country {
    

    private String name;
    private String region;
    private int population;
    private int area;
    private double density;

    /**
     * Default constructor
     */
    public Country() {

    }

    /**
     * Creates a country with all args
     * 
     * @param name
     * @param region
     * @param population
     * @param area
     * @param density
     */
    public Country(String name, String region, int population, int area, double density) {
        super();
        this.name = name;
        this.region = region;
        this.population = population;
        this.area = area;
        this.density = density;
    }

/**
     * @return the region
     */
    public String getRegion() {
        return region;
    }

    /**
     * @param region the region to set
     */
    public void setRegion(String region) {
        this.region = region;
    }

/**
     * @return the population
     */
    public int getPopulation() {
        return population;
    }

    /**
     * @param population the population to set
     */
    public void setPopulation(int population) {
        this.population = population;
    }



public static void totalPopulationByRegion(Collection<Country> countries) {
        Map<String, Integer> map = new HashMap<String, Integer>();

        int total = 0;

        for (Country country : countries) {
            if (map.containsKey(country.getRegion())) {
                map.put(country.getRegion(), total);
                total+=country.getPopulation();
            } else
                map.put(country.getRegion(), total);
        }

        for (Map.Entry m : map.entrySet()) {
            System.out.println(m.getKey() + " " + m.getValue());
        }
    }

From the output I get on the console I realise my maths logic is all wrong here, even accounting for the fact that I haven't dealt with the numbers being too large to store as an int. I get no duplicate keys, which is what I wanted; I just don't know how to get a cumulative total of the populations that map to each region. Any help with this would be appreciated.

Output I'm getting when the method is called from main:


Near east 41843152
Asia -478957430
Europe -7912568
Africa 54079957
Latin amer. & carib 17926472
Northern america -35219702
Baltics -1102504495
Oceania -616300040

Sample from the csv file:

Country,Region,Population,Area (sq. mi.)
Afghanistan,ASIA,31056997,647500
Albania,EASTERN EUROPE                     ,3581655,28748
Algeria ,NORTHERN AFRICA                    ,32930091,2381740
American Samoa ,OCEANIA                            ,57794,199
Andorra ,WESTERN EUROPE                     ,71201,468
Angola ,SUB-SAHARAN AFRICA                 ,12127071,1246700
Anguilla ,LATIN AMER. & CARIB    ,13477,102
Antigua & Barbuda ,LATIN AMER. & CARIB    ,69108,443
Argentina ,LATIN AMER. & CARIB    ,39921833,2766890

If you just want to group each region with its total population, then you need to modify your code a bit. The variable total should be declared inside your for loop and initialized with the country's population.

public static void totalPopulationByRegion(Collection<Country> countries) {
        Map</*Region*/ String, /*Population*/ Long> map = new HashMap<>();

        for (Country country : countries) {
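            // Start with this country's population...
            long total = country.getPopulation();
            if (map.containsKey(country.getRegion())) {
                // ...and add the running total already stored for its region.
                total += map.get(country.getRegion());
            }
            map.put(country.getRegion(), total);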
        }

        for (Map.Entry m : map.entrySet()) {
            System.out.println(m.getKey() + " " + m.getValue());
        }
    }

However, if you want more control over the data, it would be easier to group the Country objects themselves by region and cache the result for future use, something like this:

Map<String, List<Country>> groupData(Collection<Country> countries) {
        Map</*Region*/String, List<Country>> map = new HashMap<>();

        for (Country country : countries) {
            List<Country> regionCountries = new ArrayList<>();
            if (map.containsKey(country.getRegion())) {
                regionCountries = map.get(country.getRegion());
            }
            regionCountries.add(country);
            map.put(country.getRegion(), regionCountries);
        }
        return map;
    }
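
As a side note, the same grouping can be written a little more compactly with Map#computeIfAbsent, which creates the list for a region only the first time that region is seen. The following is a minimal sketch, not a replacement for the method above; the class name GroupByRegion and the cut-down Country stand-in are assumptions made only so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupByRegion {

    // Hypothetical, cut-down stand-in for the Country class in the question.
    static class Country {
        private final String name;
        private final String region;
        private final long population;

        Country(String name, String region, long population) {
            this.name = name;
            this.region = region;
            this.population = population;
        }

        String getName() { return name; }
        String getRegion() { return region; }
        long getPopulation() { return population; }
    }

    // Groups countries by region. computeIfAbsent returns the existing list
    // for the region, or creates and stores a new one if there is none yet.
    static Map<String, List<Country>> groupData(Collection<Country> countries) {
        Map<String, List<Country>> byRegion = new HashMap<>();
        for (Country country : countries) {
            byRegion.computeIfAbsent(country.getRegion(), region -> new ArrayList<>())
                    .add(country);
        }
        return byRegion;
    }

    public static void main(String[] args) {
        // Rows taken from the csv sample in the question.
        List<Country> countries = Arrays.asList(
                new Country("Afghanistan", "ASIA", 31056997L),
                new Country("Anguilla", "LATIN AMER. & CARIB", 13477L),
                new Country("Argentina", "LATIN AMER. & CARIB", 39921833L));
        groupData(countries).forEach((region, list) ->
                System.out.println(region + " has " + list.size() + " countries"));
    }
}
```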

This data can then be used to aggregate the total and average population per region, something like this (for the sake of convenience, I'm using the Java 8 Stream API):

Map<String, Long> getTotalPopulationPerRegion(Map<String, List<Country>> data) {
        // Sum as long: regional totals can exceed Integer.MAX_VALUE.
        Map<String, Long> result = data.entrySet()
                .stream()
                .collect(Collectors.toMap(
                        entry -> entry.getKey(),
                        entry -> entry.getValue().stream().mapToLong(country -> country.getPopulation()).sum()));
        return result;
    }

Map<String, Double> getAveragePopulationPerRegion(Map<String, List<Country>> data) {
        Map<String, Double> result = data.entrySet()
                .stream()
                .collect(Collectors.toMap(entry -> entry.getKey(), entry -> entry.getValue().stream().mapToDouble(country -> country.getPopulation()).average().orElse(Double.NaN)));
        return result;
    }

Assuming you have already changed the type of population from int to long in your Country class:

public static class Country {
    private String name;
    private String region;
    private long population;
    ...
}
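
To illustrate why the long type matters, the sketch below sums the same values once into an int and once into a long. The figures are made up and the class name OverflowDemo is hypothetical; the point is only that int addition wraps around silently once it passes Integer.MAX_VALUE (2,147,483,647), which is consistent with the negative totals shown in the question.

```java
class OverflowDemo {

    // Accumulates into an int: the sum silently wraps past Integer.MAX_VALUE.
    static int sumAsInt(int[] populations) {
        int total = 0;
        for (int p : populations) {
            total += p;
        }
        return total;
    }

    // Accumulates into a long: each int is widened before the addition.
    static long sumAsLong(int[] populations) {
        long total = 0;
        for (int p : populations) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] populations = {2_000_000_000, 1_000_000_000}; // made-up figures
        System.out.println(sumAsInt(populations));  // prints -1294967296
        System.out.println(sumAsLong(populations)); // prints 3000000000
    }
}
```

If you would rather fail loudly than wrap, Math.addExact throws an ArithmeticException on overflow.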

Here are some ways to achieve what you need:

public static void totalPopulationByRegion(Collection<Country> countries) {
    Map<String, Long> map = new HashMap<>();

    for (Country country : countries) {
        if (map.containsKey(country.getRegion())) {
            //if the map contains the region get the value and add the population of current country
            map.put(country.getRegion(), map.get(country.getRegion()) + country.getPopulation());
        } else{
            //else just put region of current country and population into the map
            map.put(country.getRegion(), country.getPopulation());
        }
    }

    for (Map.Entry m : map.entrySet()) {
        System.out.println(m.getKey() + " " + m.getValue());
    }
}

If you are using Java 8 or higher, the above can be shortened using Map#computeIfPresent and Map#computeIfAbsent, avoiding the if/else block:

public static void totalPopulationByRegion2(Collection<Country> countries) {
    Map<String, Long> map = new HashMap<>();

    for (Country country : countries) {
        map.computeIfPresent(country.getRegion(), (reg, pop)->  pop + country.getPopulation());
        map.computeIfAbsent(country.getRegion(), reg -> country.getPopulation());                   
    }

    for (Map.Entry m : map.entrySet()) {
        System.out.println(m.getKey() + " " + m.getValue());
    }
}
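
The two calls can also be collapsed into a single Map#merge call (available since Java 8), which stores the value if the key is absent and otherwise combines the old and new values with the given function. This is a minimal sketch under the same assumption that getPopulation() returns a long; the class name MergeTotals and the cut-down Country stand-in are hypothetical, and the method returns the map instead of printing it so it is easier to reuse.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MergeTotals {

    // Hypothetical, cut-down stand-in for the Country class in the question.
    static class Country {
        private final String region;
        private final long population;

        Country(String region, long population) {
            this.region = region;
            this.population = population;
        }

        String getRegion() { return region; }
        long getPopulation() { return population; }
    }

    // Sums the population per region. merge() inserts the population for a
    // region not seen before, and otherwise adds it to the stored total.
    static Map<String, Long> totalPopulationByRegion(Collection<Country> countries) {
        Map<String, Long> totals = new HashMap<>();
        for (Country country : countries) {
            totals.merge(country.getRegion(), country.getPopulation(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Rows taken from the csv sample in the question.
        List<Country> countries = Arrays.asList(
                new Country("ASIA", 31056997L),
                new Country("LATIN AMER. & CARIB", 13477L),
                new Country("LATIN AMER. & CARIB", 69108L));
        totalPopulationByRegion(countries).forEach((region, total) ->
                System.out.println(region + " " + total));
    }
}
```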

Using the Streams API, creating the map can become a one-liner with Collectors#groupingBy and Collectors#summingLong:

public static void totalPopulationByRegion3(Collection<Country> countries) {
    Map<String, Long> map = 
            countries.stream()
                     .collect(Collectors.groupingBy(Country::getRegion, 
                                                    Collectors.summingLong(Country::getPopulation)));

    for (Map.Entry m : map.entrySet()) {
        System.out.println(m.getKey() + " " + m.getValue());
    }
}
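
Since the question asks for both the total and the average, one option is to swap Collectors#summingLong for Collectors#summarizingLong, which gathers sum, count, average, min and max per region in one pass. The sketch below is one possible way to do it, not the only one; the class name RegionStats and the cut-down Country stand-in are hypothetical. The trim() call is an added assumption, based on the trailing spaces visible in the Region column of the csv sample; drop it if the regions are already cleaned up when the file is read.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.stream.Collectors;

class RegionStats {

    // Hypothetical, cut-down stand-in for the Country class in the question.
    static class Country {
        private final String region;
        private final long population;

        Country(String region, long population) {
            this.region = region;
            this.population = population;
        }

        String getRegion() { return region; }
        long getPopulation() { return population; }
    }

    // Groups by trimmed region name and collects population statistics
    // (sum, count, average, min, max) for each group.
    static Map<String, LongSummaryStatistics> statsByRegion(Collection<Country> countries) {
        return countries.stream()
                .collect(Collectors.groupingBy(
                        country -> country.getRegion().trim(),
                        Collectors.summarizingLong(Country::getPopulation)));
    }

    public static void main(String[] args) {
        // Rows taken from the csv sample in the question, trailing spaces included.
        List<Country> countries = Arrays.asList(
                new Country("ASIA", 31056997L),
                new Country("LATIN AMER. & CARIB    ", 13477L),
                new Country("LATIN AMER. & CARIB    ", 69108L),
                new Country("LATIN AMER. & CARIB    ", 39921833L));
        statsByRegion(countries).forEach((region, stats) ->
                System.out.println(region + " total=" + stats.getSum()
                        + " average=" + stats.getAverage()));
    }
}
```

LongSummaryStatistics#getSum returns a long and getAverage returns a double, so no separate averaging pass is needed.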
