
Counting occurrences while using LinkedHashSet

I have an XML file in which I need to find the year tags and count how often each year occurs. For example:

Found year 2020 10 times.
Found year 2017 1 times.
Found year 2019 2 times. 
(...)

To avoid duplicate years I used a LinkedHashSet. Code:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Publications {
    public static void main(String[] args) throws IOException {
        Set<String> publicationYears = new LinkedHashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("dblp-2020-04-01.xml"))) {
            Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
            for (String line; (line = reader.readLine()) != null; ) {
                Matcher matcher = pattern.matcher(line);
                if (matcher.find()) {
                    String year = matcher.group(1);
                    publicationYears.add(year);
                }
            }
        }
    }
}

Results:

2010
2002
1992
1994
1993
2006
(...)

But now I can't find an efficient way to count the occurrences of each year. Creating a multidimensional array and then searching it would be very slow. Any suggestions?

Try this:

  • I replaced the set with a map.
  • The statement that does the work is
        count.compute(year, (k,v)->v == null ? 1 : v + 1);
  • It stores 1 for a year the first time it encounters that year; otherwise it adds 1 to that year's count.
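import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Publications {
    public static void main(String[] args) throws IOException {
        // LinkedHashMap keeps the years in first-seen order, as LinkedHashSet did
        Map<String, Integer> count = new LinkedHashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get("dblp-2020-04-01.xml"))) {
            Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
            for (String line; (line = reader.readLine()) != null; ) {
                Matcher matcher = pattern.matcher(line);
                if (matcher.find()) {
                    String year = matcher.group(1);
                    count.compute(year, (k,v)->v == null ? 1 : v + 1);
                }
            }
        }
    }
}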

To print them out, add the following after the try block:

count.entrySet().forEach(System.out::println);
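
As a minimal, self-contained sketch of the same idea, the following runs the counting loop over a few in-memory lines instead of the real file and prints each entry in the "Found year ... times." format from the question. The class name YearCountDemo, the helper countYears, and the sample lines are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YearCountDemo {
    // Counts <year> tags, taking at most one match per line as the answer above does.
    static Map<String, Integer> countYears(List<String> lines) {
        Pattern pattern = Pattern.compile("<year>(.+?)</year>");
        Map<String, Integer> count = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher matcher = pattern.matcher(line);
            if (matcher.find()) {
                // First sighting stores 1; later sightings increment the stored value.
                count.compute(matcher.group(1), (k, v) -> v == null ? 1 : v + 1);
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines standing in for the real XML file.
        List<String> lines = List.of(
                "<year>2020</year>",
                "<title>Some title</title>",
                "<year>2017</year>",
                "<year>2020</year>");
        countYears(lines).forEach((year, n) ->
                System.out.println("Found year " + year + " " + n + " times."));
    }
}
```

With these sample lines it prints "Found year 2020 2 times." followed by "Found year 2017 1 times.". Like the original loop, it counts at most one year per line; replacing the if with a while would count every match on a line.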

There are many ways to do it. Some of them are given below:

  1. Add all the years to a List and use Collections.frequency as follows:
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Set<Integer> yearSet = new LinkedHashSet<Integer>(years);
        for (Integer year : yearSet) {
            System.out.println("Found year " + year + " " + Collections.frequency(years, year) + " times");
        }
    }
}

Output:

Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
  2. Add all the years to a List and then create a Map of frequency as follows:
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Map<Integer, Integer> frequencyMap = new LinkedHashMap<Integer, Integer>();
        for (Integer year : years) {
            if (frequencyMap.get(year) == null) {
                frequencyMap.put(year, 1);
            } else {
                frequencyMap.put(year, frequencyMap.get(year) + 1);
            }
        }
        for (Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
            System.out.println("Found year " + entry.getKey() + " " + entry.getValue() + " times");
        }
    }
}

Output:

Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
  3. Add all the years to a List and then create a Map of frequency by using Map::merge as follows:
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

public class Main {
    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        Map<Integer, Integer> frequencyMap = new LinkedHashMap<Integer, Integer>();
        years.forEach(year -> frequencyMap.merge(year, 1, (oldValue, newValue) -> oldValue + newValue));
        for (Entry<Integer, Integer> entry : frequencyMap.entrySet()) {
            System.out.println("Found year " + entry.getKey() + " " + entry.getValue() + " times");
        }
    }
}

Output:

Found year 2010 4 times
Found year 2011 2 times
Found year 2013 1 times
Found year 2012 1 times
Found year 2014 1 times
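
One further variant, not among the approaches above, builds the same frequency map with the Stream API. This is only a sketch: the class name StreamFrequency and the helper frequencies are hypothetical, and the counts come back as Long because that is what Collectors.counting produces.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class StreamFrequency {
    // Groups equal years together and counts each group.
    // LinkedHashMap::new keeps the years in first-seen order for a sequential stream.
    static Map<Integer, Long> frequencies(List<Integer> years) {
        return years.stream().collect(Collectors.groupingBy(
                Function.identity(), LinkedHashMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Integer> years = List.of(2010, 2011, 2013, 2010, 2012, 2010, 2011, 2010, 2014);
        frequencies(years).forEach((year, n) ->
                System.out.println("Found year " + year + " " + n + " times"));
    }
}
```

For the same input list it prints the same five lines as the examples above.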
