
Multinomial Naive Bayes - sklearn

import numpy as np
from sklearn.naive_bayes import MultinomialNB
X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB()
mnb.fit(X, y)
mnb.predict([[0.11, 0.32]])

--> it predicts 0

Shouldn't it predict 1?

Not necessarily. You can't assume that just because a model has seen an observation it will predict the corresponding label correctly. This is especially true for a high-bias algorithm like Naive Bayes. High-bias models tend to oversimplify the relationship between your X and y, and what you're seeing here is a product of that. On top of that, you fit only 4 samples, which is far too few for a model to learn a robust relationship.
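
To see how much that 3-to-1 label split matters on its own, you can inspect the prior the model fitted. This is a minimal sketch using the standard scikit-learn attributes class_count_ and class_log_prior_ on the data from the question:

import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB().fit(X, y)

# number of training samples seen per class
print(mnb.class_count_)              # [3. 1.]
# fitted class prior P(class); exponentiate the stored log prior
print(np.exp(mnb.class_log_prior_))  # [0.75 0.25]

Before a single feature value is looked at, class 0 starts with a log-prior advantage of log(3) ≈ 1.1, and with feature values this small (all below 1) the likelihood terms do little to offset it.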

If you're curious how exactly the model arrives at these predictions: Multinomial Naive Bayes scores each sample with its joint log likelihood under each class. You can compute those values from your fitted model:

>>> jll = mnb._joint_log_likelihood(X)
>>> jll
array([[-0.87974542, -2.02766662],
       [-0.60540174, -1.73662711],
       [-1.24051492, -2.36300468],
       [-0.54761186, -1.66776584]])
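
Under the hood those numbers are just log P(c) + sum_j x_j * log(theta_cj). A short sketch that reproduces them from the public attributes feature_log_prob_ and class_log_prior_ (it should agree with the private _joint_log_likelihood call above):

import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB().fit(X, y)

# joint log likelihood = log prior + weighted sum of per-class log feature probabilities
jll_manual = X @ mnb.feature_log_prob_.T + mnb.class_log_prior_
print(jll_manual)  # last row should be roughly [-0.5476, -1.6678]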

From there, the predict stage takes the argmax across classes, which is where the predicted class label comes from:

>>> mnb.classes_[np.argmax(jll, axis=1)]
array([0, 0, 0, 0])

You can see that as it currently stands, the model will predict 0 for all of the samples you've provided.
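
If probabilities are easier to read than raw log scores, the same joint log likelihoods can be normalized with a log-sum-exp; this sketch (using scipy.special.logsumexp) should agree with the model's own predict_proba:

import numpy as np
from scipy.special import logsumexp
from sklearn.naive_bayes import MultinomialNB

X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB().fit(X, y)

jll = mnb._joint_log_likelihood(X)
# normalize the per-class log scores into probabilities that sum to 1 per row
proba_manual = np.exp(jll - logsumexp(jll, axis=1, keepdims=True))

print(np.allclose(proba_manual, mnb.predict_proba(X)))  # expected: True
print(mnb.predict_proba([[0.11, 0.32]]))                # class 0 gets the larger probability (~0.75)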

It depends. Here, only one of the samples used for fitting/training belongs to class 1. Also, each sample has only 2 features and there are only 4 samples in total, so the training will be poor.

import numpy as np
from sklearn.naive_bayes import MultinomialNB
X = np.array([[0.25, 0.73], [0.12, 0.42], [0.53, 0.92], [0.11, 0.32]])
y = np.array([0, 0, 0, 1])
mnb = MultinomialNB()
mnb.fit(X, y)

>>> mnb.predict([[0.11, 0.32]])
array([0])
>>> mnb.predict([[0.25, 0.73]])
array([0])

The model learns a rule that successfully predicts class 0 but never class 1. This is related to the trade-off between specificity and sensitivity. Put differently, the model cannot generalize the rule beyond the majority class it has seen.
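
As a hedged illustration of that point, here is a made-up toy dataset (not from the question) with balanced classes, more samples, and count-like feature values; on data like this the same MultinomialNB has no trouble predicting class 1:

import numpy as np
from sklearn.naive_bayes import MultinomialNB

# hypothetical count data: class 0 is dominated by feature 2, class 1 by feature 1
X = np.array([[ 5, 95], [10, 90], [ 8, 92],
              [90, 10], [85, 15], [95,  5]])
y = np.array([0, 0, 0, 1, 1, 1])

mnb = MultinomialNB().fit(X, y)

print(mnb.predict([[80, 20]]))  # [1] -- feature-1-heavy sample goes to class 1
print(mnb.predict([[10, 90]]))  # [0] -- feature-2-heavy sample goes to class 0

With balanced priors and feature counts that genuinely differ between the classes, the likelihood term rather than the prior drives the decision.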
