How to select optimal number of components for NMF in python sklearn?

Question

sklearn has no built-in function for this.

In my research I found out that a "precision score" err(components) can be calculated via

The optimal number of components will have the minimum err(c).

Given the test code below, how can the precision score be implemented in Python?

import numpy as np
from sklearn.decomposition import NMF

X = np.random.rand(40, 100)  # non-negative 40 x 100 matrix for NMF
c = 4  # number of components to evaluate
model = NMF(n_components=c, init='random', random_state=0)
W = model.fit_transform(X)  # W has shape (40, c)
H = model.components_  # H has shape (c, 100)

Answer 1

I'm not sure about the transposition in your formula, since sklearn already seems to return H in transposed form (W @ H has the same shape as X), but this should do the trick:

err = np.linalg.norm(X - W @ H)**2/np.linalg.norm(X)**2
print(err)
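
To compare several candidate values of c, the same expression can be evaluated in a loop. The following is only a minimal sketch: the helper name relative_error, the candidate list, and max_iter=500 are assumptions made for illustration and are not part of the question.

```python
import numpy as np
from sklearn.decomposition import NMF


def relative_error(X, c, random_state=0):
    """Fit NMF with c components and return ||X - WH||^2 / ||X||^2.

    Hypothetical helper; np.linalg.norm defaults to the Frobenius norm
    for 2-D arrays.
    """
    # max_iter=500 is an assumed value to give the solver more room to converge
    model = NMF(n_components=c, init='random',
                random_state=random_state, max_iter=500)
    W = model.fit_transform(X)
    H = model.components_
    return np.linalg.norm(X - W @ H) ** 2 / np.linalg.norm(X) ** 2


rng = np.random.default_rng(0)
X = rng.random((40, 100))      # same shape as the test matrix in the question
candidates = [1, 2, 4, 8]      # assumed candidate component counts

errors = {c: relative_error(X, c) for c in candidates}
for c, e in errors.items():
    print(f"c={c}: err={e:.4f}")
```

On the training data this error usually keeps shrinking as c grows, so it may be more informative to look for the point where the curve flattens than for a strict minimum.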

Answer 2

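Simply use the fitted model's built-in attribute reconstruction_err_ as follows. Note that with the default Frobenius loss this is the norm of X - WH itself, neither squared nor divided by the norm of X: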

err = model.reconstruction_err_
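
As a quick check, assuming the default beta_loss='frobenius', the attribute can be converted into the normalized score by squaring it and dividing by the squared norm of X. The sketch below compares that against the manual computation; max_iter=500 and the variable names err_attr and err_manual are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((40, 100))

# max_iter=500 is an assumed value, not taken from the question
model = NMF(n_components=4, init='random', random_state=0, max_iter=500)
W = model.fit_transform(X)
H = model.components_

# Normalized score derived from the fitted attribute
err_attr = model.reconstruction_err_ ** 2 / np.linalg.norm(X) ** 2
# Same score computed directly from the factors
err_manual = np.linalg.norm(X - W @ H) ** 2 / np.linalg.norm(X) ** 2

print(err_attr, err_manual)
```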
