简体   繁体   中英

Java 8 Streams multiple grouping By

I have a temperature record something like this

dt        |AverageTemperature |AverageTemperatureUncertainty|City   |Country |Latitude|Longitude
----------+-------------------+-----------------------------+-------+--------+--------+---------
1963-01-01|-5.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E  
1963-02-01|-4.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E  
1964-01-01|-5.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E  
1964-02-01|-4.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E  
1965-01-01|11.417000000000002 |0.5                          |Karachi|Pakistan|57.05N  |10.33E 
1965-02-01|12.7650000000000015|0.328                        |Karachi|Pakistan|57.05N  |10.33E

I have to parse this into a POJO and calculate the average delta as per the following problem statement:

Use the Streams API to calculate the average annual temperature delta for each country. To calculate delta the average temperature in 1900 would be subtracted from the average temperature in 1901 to obtain the delta from 1900 to 1901 for a particular city. The average of all these deltas is the average annual temperature delta for a city. The average of all cities in a country is the average of a country.

My Temperate POJO looks like following having getters and setters

public class Temperature {
    private java.util.Date date;
    private double averageTemperature;
    private double averageTemperatureUncertainty;
    private String city;
    private String country;
    private String latitude;
    private String longitude;
}

I have maintained a list of temperatures as this problem is to be achieved using streams.

To calculate the delta I am trying to use the following streams but I am still unable to calculate the actual delta, as I have to calculate the average country delta, I have performed grouping over country, city and date.

Map<String, Map<String, Map<Integer, Double>>> countriesMap = this.getTemperatures().stream()
                .sorted(Comparator.comparing(Temperature::getDate))
                .collect(Collectors.groupingBy(Temperature::getCountry,
                        Collectors.groupingBy(Temperature::getCity,
                        Collectors.groupingBy
                                (t -> {
                                            Calendar calendar = Calendar.getInstance();
                                            calendar.setTime(t.getDate());
                                            return calendar.get(Calendar.YEAR);
                                        }, 
                        Collectors.averagingDouble(Temperature::getAverageTemperature)))));

In order to calculate the delta we will have to calculate the differences for the Map<Integer, Double> .

For calculating the difference I came up with the following code but couldn't connect following code with the above one

Stream.of(10d, 20d, 10d) //this is sample data that I that I get in `Map<Integer, Double>` of countriesMap
        .map(new Function<Double, Optional<Double>>() {
            Optional<Double> previousValue = Optional.empty();
            @Override
            public Optional<Double> apply(Double current) {
                Optional<Double> value = previousValue.map(previous -> current - previous);
                previousValue = Optional.of(current);
                return value;
            }
        })
        .filter(Optional::isPresent)
        .map(Optional::get)
        .forEach(System.out::println);

How can I calculate the delta using streams in one go or how to perform stream operations over countriesMap in order to calculate delta and acheive the mentioned problem statment.?

To cut down the problem statement into a smaller block, another approach that you could look into is parsing through the year ly temperature and computing the delta for them, further average ing it. This would though have to be done for all the values of type Map<Integer, Double> within the inner Map in your question. It would look something like:

Map<Integer, Double> unitOfWork = new HashMap<>(); // innermost map you've attained ('yearToAverageTemperature' map)
unitOfWork = unitOfWork.entrySet()
        .stream()
        .sorted(Map.Entry.comparingByKey())
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));
// the values sorted based on the year from a sorted map
List<Double> srtedValPerYear = new ArrayList<>(unitOfWork.values());
// average of deltas from the complete list 
double avg = IntStream.range(0, srtedVal.size() - 1)
        .mapToDouble(i -> (srtedVal.get(i + 1) - srtedVal.get(i)))
        .average().orElse(Double.NaN);

To note further, this is just an average of one City 's record of <Year, AverageTemperature> , you would have to iterate through all your City keyset and similarly for all your Country keyset to exhaustively find out such averages.

Further moving this unit of work into a method, iterating through the complete map of maps, this might be accomplished as :

// The average of all cities in a country is the average of a country.
AtomicReference<Double> countryValAvg = new AtomicReference<>(0.0);
countriesMap.forEach((country, cityMap) -> {
    // The average of all these deltas is the average annual temperature delta for a city.
    AtomicReference<Double> cityAvgTemp = new AtomicReference<>((double) 0);
    cityMap.forEach((city, yearMap) -> cityAvgTemp.set(cityAvgTemp.get() + averagePerCity(yearMap)));
    double avgAnnualTempDeltaPerCity = cityAvgTemp.get() / cityMap.size();

    countryValAvg.set(countryValAvg.get() + avgAnnualTempDeltaPerCity);
});
System.out.println(countryValAvg.get() / countriesMap.size());

where averagePerCity is the method that does following

double averagePerCity(Map<Integer, Double> unitOfWork) {
    unitOfWork = unitOfWork.entrySet()
            .stream()
            .sorted(Map.Entry.comparingByKey())
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));
    List<Double> srtedVal = new ArrayList<>(unitOfWork.values());
    return IntStream.range(0, srtedVal.size() - 1)
            .mapToDouble(i -> (srtedVal.get(i + 1) - srtedVal.get(i)))
            .average().orElse(Double.NaN);
}

Note : The code above might be missing validations, its just to provide an idea of how could the complete problem be broken into smaller parts and then solved.

Edit1 : Which could be improved further as :

// The average of all cities in a country is the average of a country.
AtomicReference<Double> countryValAvg = new AtomicReference<>(0.0);
countriesMap.forEach((country, cityMap) -> {
    // The average of all these deltas is the average annual temperature delta for a city.
    double avgAnnualTempDeltaPerCity = cityMap.values()
            .stream()
            .mapToDouble(Quick::averagePerCity) // Quick is my class name
            .average()
            .orElse(Double.NaN);
    countryValAvg.set(countryValAvg.get() + avgAnnualTempDeltaPerCity);
});
System.out.println(countryValAvg.get() / countriesMap.size());

Edit2 : And further to

double avgAnnualTempDeltaPerCity = countriesMap.values().stream()
        .mapToDouble(cityMap -> cityMap.values()
                .stream()
                .mapToDouble(Quick::averagePerCity) // Quick is my class name
                .average()
                .orElse(Double.NaN))
        .average().orElse(Double.NaN);

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM