Count frequency of elements in Java

I am trying to count the frequency of all dates from a text file. The dates are stored in parsed.get(0) but when I print the frequency I get this output:

1946-01-12: 1
1946-01-12: 1
1946-01-12: 1
1946-01-13: 1
1946-01-13: 1
1946-01-13: 1
1946-01-14: 1
1946-01-14: 1
1946-01-14: 1
1946-01-15: 1

instead of

1946-01-12: 3
1946-01-13: 3
1946-01-14: 3
1946-01-15: 1

I guess it is because I have to store the dates like ("1946-01-12", "1946-01-12", "1946-01-12", "1946-01-12", "1946-01-13", "1946-01-13",...). If I just print parsed.get(0) I get

1946-01-12
1946-01-12
1946-01-12
1946-01-13
1946-01-13
1946-01-13
1946-01-14
1946-01-14
1946-01-14
1946-01-15

How can I solve it based on my code below?

private static List<WeatherDataHandler> weatherData = new ArrayList<>();
public void loadData(String filePath) throws IOException {

//Read all data
    List<String> fileData = Files.readAllLines(Paths.get("filePath"));
    System.out.println(fileData);

    for(String str : fileData) {
        List<String> parsed = parseData(str);
        LocalDate dateTime = LocalDate.parse(parsed.get(0));

        WeatherDataHandler weather = new WeatherDataHandler(dateTime, Time, temperature, tag);
        weatherData.add(weather);

        List<String> list = Arrays.asList(parsed.get(0));

        Map<String, Long> frequencyMap =
                list.stream().collect(Collectors.groupingBy(Function.identity(), 
                                                        Collectors.counting()));

            for (Map.Entry<String, Long> entry : frequencyMap.entrySet()) {
                System.out.println(entry.getKey() + ": " + entry.getValue());
            }
    }

The problem

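Everything inside the for-loop runs on every iteration. So on each pass you are recreating your collection of dates (which then holds only the single date from the current line) and recreating the stream and the map for analysis. That is why every date is printed with a count of 1. Not good.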
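To make the cause concrete, here is a minimal sketch that mirrors the loop from the question on a plain list of date strings. The class name `PerIterationBug` and the method `countInsideLoop` are made up for this illustration, and the output lines are collected in a list instead of being printed directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class PerIterationBug {

    // Hypothetical helper: repeats the question's mistake on purpose.
    static List<String> countInsideLoop(List<String> dates) {
        List<String> printed = new ArrayList<>();
        for (String date : dates) {
            // A brand-new list on every pass; it always holds exactly one element.
            List<String> list = Arrays.asList(date);

            // A brand-new map on every pass, so no count can ever exceed 1.
            Map<String, Long> frequencyMap = list.stream()
                    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

            frequencyMap.forEach((k, v) -> printed.add(k + ": " + v));
        }
        return printed;
    }

    public static void main(String[] args) {
        List<String> dates = List.of("1946-01-12", "1946-01-12", "1946-01-13");
        countInsideLoop(dates).forEach(System.out::println);
    }
}
```

Each input line produces its own one-entry map, so the output has one line per input line, each ending in `: 1`.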

The solution

Move the stream and analysis code outside the for-loop.

Rethink your code as two phases.

  • First phase is parsing the inputs, preprocessing the incoming data into a form you want to work with. In this case we need to read a text file, parse the lines into LocalDate objects, and add these objects to a collection. This code uses the for-loop.
  • Phase two is the stream work to process that reformed data, the collection of LocalDate objects. This code comes after the for-loop.

In my own work, I would literally put these bullet points in my code as comments. I would also add divider lines (comment lines made of a run of hash or equals signs) to mark each phase in the code, and I might move each phase into its own method as a subroutine.
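The two phases could be sketched roughly like this. It is only an illustration under assumptions: the class `TwoPhaseCount` and the methods `parseDates` and `countDates` are invented names, the sample rows are made up, and the date is assumed to be the first comma-separated field of each line because the real `parseData` is not shown in the question.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class TwoPhaseCount {

    // ===== Phase 1: parse the raw lines into LocalDate objects =====
    // Assumption: the date is the first comma-separated field of a line.
    static List<LocalDate> parseDates(List<String> lines) {
        List<LocalDate> dates = new ArrayList<>();
        for (String line : lines) {
            String firstField = line.split(",")[0].trim();
            dates.add(LocalDate.parse(firstField));
        }
        return dates;
    }

    // ===== Phase 2: count the collected dates once, after the loop =====
    static Map<LocalDate, Long> countDates(List<LocalDate> dates) {
        // TreeMap keeps the keys in chronological order for printing.
        return dates.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up sample rows; only the first field matters here.
        List<String> lines = List.of(
                "1946-01-12,a", "1946-01-12,b", "1946-01-12,c",
                "1946-01-13,a", "1946-01-13,b", "1946-01-13,c",
                "1946-01-14,a", "1946-01-14,b", "1946-01-14,c",
                "1946-01-15,a");

        countDates(parseDates(lines))
                .forEach((date, count) -> System.out.println(date + ": " + count));
    }
}
```

With these sample rows the program prints the four dates with counts 3, 3, 3 and 1, which is the output the question asks for.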

By the way, once you get this working, for fun you might want to try replacing the for-loop reading the file with a stream. Java can read the file as a stream of lines.
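One way that could look is sketched below with `Files.lines`. The class `StreamFileCount` and the method `countDates` are hypothetical names, and the code again assumes that the date is the first comma-separated field of each line. The `main` method writes a small temporary file so that the sketch runs on its own.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamFileCount {

    // Assumption: the date is the first comma-separated field of a line.
    static Map<LocalDate, Long> countDates(Path file) throws IOException {
        // Files.lines keeps the file open, so close the stream with try-with-resources.
        try (Stream<String> lines = Files.lines(file)) {
            return lines
                    .filter(line -> !line.trim().isEmpty())   // skip blank lines
                    .map(line -> LocalDate.parse(line.split(",")[0].trim()))
                    .collect(Collectors.groupingBy(
                            date -> date, TreeMap::new, Collectors.counting()));
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary file with made-up sample data.
        Path file = Files.createTempFile("dates", ".txt");
        Files.write(file, List.of("1946-01-12", "1946-01-12", "1946-01-13"));

        System.out.println(countDates(file)); // prints {1946-01-12=2, 1946-01-13=1}

        Files.delete(file);
    }
}
```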

I tested this one briefly; you can check the result:

{1946-01-14=3, 1946-01-15=1, 1946-01-12=3, 1946-01-13=3}

The original file was:

1946-01-12: 1
1946-01-12: 1
1946-01-12: 1
1946-01-13: 1
1946-01-13: 1
1946-01-13: 1
1946-01-14: 1
1946-01-14: 1
1946-01-14: 1
1946-01-15: 1

Modify it as you like.

Code:

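    // Needs imports from java.io, java.util, java.util.function and java.util.stream.
    try (Scanner scanner = new Scanner(new File("src/main/resources/test.txt"))) {
        // "\\Z" matches the end of input, so next() returns the whole file as one string.
        String content = scanner.useDelimiter("\\Z").next();
        // "\\R" splits on any line break (\n, \r\n, ...).
        String[] dates = content.split("\\R");
        Map<String, Long> m = Arrays.stream(dates)
                // Keep only the text before ':' (not necessary if the file has no ": 1" suffix).
                .map(line -> line.split(":")[0])
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

        System.out.println(m);
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }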

With this, you can get the number of elements in the list that have the same value.

int numberOfElements = Collections.frequency(list, "1946-01-12");
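`Collections.frequency` counts one value per call, so getting a count for every date means calling it once per distinct value. A minimal sketch of that, assuming the dates are already available as a `List<String>` (the class `FrequencyDemo` and the method `countOf` are names invented here):

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

class FrequencyDemo {

    // Hypothetical wrapper around Collections.frequency for a single value.
    static int countOf(List<String> dates, String wanted) {
        return Collections.frequency(dates, wanted);
    }

    public static void main(String[] args) {
        List<String> dates = List.of(
                "1946-01-12", "1946-01-12", "1946-01-12", "1946-01-13");

        // Count of one specific value.
        System.out.println(countOf(dates, "1946-01-12")); // prints 3

        // Count of every distinct value; LinkedHashSet keeps first-seen order.
        for (String date : new LinkedHashSet<>(dates)) {
            System.out.println(date + ": " + countOf(dates, date));
        }
    }
}
```

Every call scans the whole list, so this is fine for a small file but does more work than a single grouping pass when there are many distinct dates.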

Based on how I think this works, I would do it as follows. Comments are included to explain the additional logic. The main idea is to do as much as possible inside your main loop. Creating the frequencyMap with a stream outside the loop is an additional pass over the data and unnecessary work.

    private static List<WeatherDataHandler> weatherData =
            new ArrayList<>();

    public void loadData(String filePath) throws IOException {

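        // Read all data. Pass the filePath parameter itself; the quoted
        // literal "filePath" would look for a file with that exact name.
        List<String> fileData =
                Files.readAllLines(Paths.get(filePath));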
        System.out.println(fileData);

        // Pre-instantiate the frequency map. LinkedHashMap keeps the dates
        // in the order in which they are first seen.
        Map<String, Long> frequencyMap = new LinkedHashMap<>();

        for (String str : fileData) {

            List<String> parsed = parseData(str);

            LocalDate dateTime =
                    LocalDate.parse(parsed.get(0));

            WeatherDataHandler weather = new WeatherDataHandler(
                    dateTime, Time, temperature, tag);
            weatherData.add(weather);

            // Convert the date to a String key. LocalDate.toString() gives
            // the ISO-8601 form (e.g. 1946-01-12); change it here if you
            // want a different format.
            String strDate = dateTime.toString();

            // Use the compute method of Map. If the count is null,
            // initialize it to 1, otherwise, add 1 to the existing value.
            frequencyMap.compute(strDate,
                    (date, count) -> count == null ? 1 : count + 1);
        }

        for (Map.Entry<String, Long> entry : frequencyMap
                .entrySet()) {
            System.out.println(
                    entry.getKey() + ": " + entry.getValue());
        }
    }

You can also print the map as follows:

frequencyMap.forEach((k,v)->System.out.println(k + ": " + v));
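As a side note, the counting step can also be written with `Map.merge` instead of `compute`. This is a swapped-in variant, not what the code above uses, and the class `MergeCount` and the method `count` are made-up names for the sketch:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MergeCount {

    // Hypothetical helper: counts each string in one pass over the list.
    static Map<String, Long> count(List<String> dates) {
        Map<String, Long> frequencyMap = new LinkedHashMap<>();
        for (String date : dates) {
            // Store 1 for a new key, otherwise add 1 to the stored count.
            frequencyMap.merge(date, 1L, Long::sum);
        }
        return frequencyMap;
    }

    public static void main(String[] args) {
        Map<String, Long> frequencyMap =
                count(List.of("1946-01-12", "1946-01-12", "1946-01-13"));
        System.out.println(frequencyMap); // prints {1946-01-12=2, 1946-01-13=1}
    }
}
```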

Finally, the above could have been simplified in a few places, for example by using Files.lines(path) to create a stream. But as you are also adding each record to a WeatherDataHandler list and wanted to keep your structure, I did not use that feature.
