
Java - Initialize a HashMap of HashMaps

I am new to Java and practicing by creating a simplistic Naive Bayes classifier. I am still new to object instantiation and wonder how to initialize a HashMap of HashMaps. When inserting new observations into the classifier, I can create a new HashMap for an unseen feature name in a given class, but do I need to initialize the outer map as well?

import java.util.HashMap;

public class NaiveBayes {

    private HashMap<String, Integer> class_counts;
    private HashMap<String, HashMap<String, Integer>> class_feature_counts;

    public NaiveBayes() {
        class_counts = new HashMap<String, Integer>();
        // do I need to initialize class_feature_counts?
    }

    public void insert() {
        // todo
        // I think I can create new hashmaps on the fly here for class_feature_counts
    }

    public String classify() {
        // stub 
        return "";
    }

    // Naive Scoring:
    // p( c | f_1, ... f_n) =~ p(c) * p(f_1|c) ... * p(f_n|c)
    private double get_score(String category, HashMap features) {
        // stub
        return 0.0;
    }

    public static void main(String[] args) {
        NaiveBayes bayes = new NaiveBayes();
        // todo
    }
}

Note that this question is not specific to Naive Bayes classifiers; I just thought I would provide some context.

Yes, you need to initialize it.

class_feature_counts = new HashMap<String, HashMap<String, Integer>>();

When you want to add a value to class_feature_counts, you need to instantiate that inner map too:

HashMap<String, Integer> val = new HashMap<String, Integer>();
// Do what you want to do with val
class_feature_counts.put("myKey", val);
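The two steps above (create the inner map if it is missing, then store into it) can be folded into a single expression. This is only a minimal sketch, assuming Java 8 or later; the `NestedCounts` class and its `increment` and `count` methods are hypothetical names, not part of the original question.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, used only for illustration.
class NestedCounts {
    // The outer map is created once, at declaration.
    private final Map<String, Map<String, Integer>> classFeatureCounts = new HashMap<>();

    // Records one observation of `feature` under `category`.
    void increment(String category, String feature) {
        classFeatureCounts
            .computeIfAbsent(category, k -> new HashMap<>()) // create the inner map on first use
            .merge(feature, 1, Integer::sum);                // start at 1, or add 1 to the old count
    }

    // Returns the stored count, or 0 for an unseen category or feature.
    int count(String category, String feature) {
        Map<String, Integer> inner = classFeatureCounts.get(category);
        return inner == null ? 0 : inner.getOrDefault(feature, 0);
    }

    public static void main(String[] args) {
        NestedCounts counts = new NestedCounts();
        counts.increment("spam", "free");
        counts.increment("spam", "free");
        counts.increment("ham", "meeting");
        System.out.println(counts.count("spam", "free")); // prints 2
    }
}
```

Only the outer map needs an explicit `new` here; the inner maps are still instantiated, just inside `computeIfAbsent`.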

Recursive generic data structures, like maps of maps, while not an outright bad idea, often indicate something you could refactor: the inner map could often be a first-class object (with named fields or an internal map) rather than simply a map. You'll still have to initialize these inner objects, but it is often a much cleaner, clearer way to develop.

For instance, if you have a Map<A,Map<B,C>>, you're often really storing a map of A to Thing, and the way Thing is stored just happens to be a map. You'll often find it cleaner and easier to hide the fact that Thing is a map, and instead store a Map<A,Thing> where Thing is defined as:

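public class Thing {
    // B and C stand in for your concrete key and value types.
    // The map is guaranteed to be initialized if a Thing exists.
    // Map is an interface, so instantiate a concrete class such as HashMap.
    private final Map<B, C> data = new HashMap<B, C>();

    // operations on data, like get and put,
    // can now have sanity checks you couldn't enforce when the map was public
}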
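Applied to the classifier in the question, the idea might look like the sketch below. The names `FeatureCounts`, `CountsByClass`, `add`, `observe` and `count` are all hypothetical, and the sanity checks are only examples of what such a wrapper could enforce. It assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical "Thing": the feature counts for one class, with the map hidden.
class FeatureCounts {
    // Initialized at declaration, so it exists whenever a FeatureCounts exists.
    private final Map<String, Integer> counts = new HashMap<>();

    // Adds n observations of a feature, rejecting input a bare map would accept.
    void add(String feature, int n) {
        if (feature == null || feature.isEmpty()) {
            throw new IllegalArgumentException("feature must be non-empty");
        }
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        counts.merge(feature, n, Integer::sum);
    }

    // Returns 0 for a feature that has never been added.
    int get(String feature) {
        return counts.getOrDefault(feature, 0);
    }
}

// The outer structure is now a map of class name to FeatureCounts, not a map of maps.
class CountsByClass {
    private final Map<String, FeatureCounts> classFeatureCounts = new HashMap<>();

    void observe(String category, String feature) {
        classFeatureCounts
            .computeIfAbsent(category, k -> new FeatureCounts()) // inner object still needs creating
            .add(feature, 1);
    }

    int count(String category, String feature) {
        FeatureCounts fc = classFeatureCounts.get(category);
        return fc == null ? 0 : fc.get(feature);
    }
}
```

The inner object still has to be created for each new class name, but callers can no longer reach into the map and store a null key or a negative count.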

Also, look into Guava's Multimap/Multiset utilities; they're very useful for cases like this, in particular because they do the inner-object initialization automatically. Of note for your case, just about any time you implement Map<E, Integer> you really want a Guava Multiset. Cleaner and clearer.

You must create an object before using it via a reference variable. It doesn't matter how complex that object is. You aren't required to initialize it in the constructor, although that is the most common case. Depending on your needs, you might want to use "lazy initialization" instead.
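As a rough illustration of lazy initialization, the sketch below leaves the field null until the first write. The `LazyCounts` class and its methods are hypothetical names, the code assumes Java 8 or later, and it is not thread-safe as written.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class showing lazy initialization of a map field.
class LazyCounts {
    // Starts as null; no map is allocated until one is actually needed.
    private Map<String, Integer> counts;

    // Single access point that creates the map on first use.
    private Map<String, Integer> counts() {
        if (counts == null) {
            counts = new HashMap<>();
        }
        return counts;
    }

    void increment(String key) {
        counts().merge(key, 1, Integer::sum);
    }

    // Reads do not force allocation: an absent map simply means a count of 0.
    int get(String key) {
        return counts == null ? 0 : counts.getOrDefault(key, 0);
    }

    boolean isAllocated() {
        return counts != null;
    }
}
```

For a small map like the ones in the question, initializing eagerly is usually simpler; the lazy form mainly pays off when the object is expensive to build or often never used.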

  1. Do not declare your variables as HashMap. It's too limiting; declare them as Map so the implementation can change later.
  2. Yes, you need to initialize class_feature_counts. You'll be adding entries to it, so it has to be a valid map. In fact, initialize both maps at declaration rather than in the constructor, since there is only one way for each to start. I hope you're using Java 7 by now; it's simpler this way.

    private Map<String, Integer> classCounts = new HashMap<>();

    private Map<String, Map<String, Integer>> classFeatureCounts = new HashMap<>();

The compiler will infer the type arguments from the <>. Also, I changed the variable names to standard Java camel-case style. Are classCounts and classFeatureCounts connected?
