
How to select optimal number of components for NMF in python sklearn?

There is no built-in function in Python's sklearn to do this.

In my research I found that a "precision score" err(c), where c is the number of components, can be calculated via

[image: formula for err(c); not included in the source]

The optimal number of components will have the minimum err(c).

Given the below test code, how can the precision score be implemented in python?

import numpy as np
import pandas as pd
from sklearn.decomposition import NMF
X = np.random.rand(40, 100) # create matrix for NMF
c = 4
model = NMF(n_components=c, init='random', random_state=0)
W = model.fit_transform(X)
H = model.components_

I'm not sure about the transposition in your formula, since sklearn seems to return H already transposed, but this should do the trick:

err = np.linalg.norm(X - W @ H)**2/np.linalg.norm(X)**2
print(err)
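
To compare several candidate values of c, the same expression can be wrapped in a loop. The following is only a minimal sketch: the helper name relative_error, the candidate range 1 to 8, and max_iter=500 are assumptions made for illustration and do not come from the question.

```python
import numpy as np
from sklearn.decomposition import NMF


def relative_error(X, n_components, random_state=0):
    """Return ||X - WH||_F^2 / ||X||_F^2 for an NMF fit with n_components.

    Hypothetical helper, not part of sklearn.
    """
    model = NMF(n_components=n_components, init='random',
                random_state=random_state, max_iter=500)
    W = model.fit_transform(X)
    H = model.components_
    return np.linalg.norm(X - W @ H) ** 2 / np.linalg.norm(X) ** 2


# Same shape as the test matrix in the question, seeded for repeatability
rng = np.random.default_rng(0)
X = rng.random((40, 100))

# Assumed candidate range; adjust to the problem at hand
candidates = range(1, 9)
errors = {c: relative_error(X, c) for c in candidates}

for c, err in errors.items():
    print(c, round(err, 4))
```

Note that this error usually keeps shrinking as c grows, so the raw minimum tends to sit at the largest c tried. It may be more useful to look for the point where the curve flattens out.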

Simply use the built-in reconstruction_err_ attribute of the fitted model, as follows:

err = model.reconstruction_err_
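
With the default Frobenius loss, reconstruction_err_ is the unsquared, unnormalized norm of X - WH, so it is not the same number as the ratio in the other answer. The sketch below shows how the two appear to relate. It assumes the default beta_loss='frobenius'; max_iter=500 is an arbitrary choice.

```python
import numpy as np
from sklearn.decomposition import NMF

X = np.random.default_rng(0).random((40, 100))
model = NMF(n_components=4, init='random', random_state=0, max_iter=500)
W = model.fit_transform(X)
H = model.components_

# Frobenius norm of the residual, computed by hand
frob = np.linalg.norm(X - W @ H)
print(np.isclose(model.reconstruction_err_, frob))

# Square it and divide by ||X||^2 to get the normalized score
normalized = model.reconstruction_err_ ** 2 / np.linalg.norm(X) ** 2
print(normalized)
```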

