
LDA ignoring n_components?

When I am trying to work with LDA from Scikit-Learn, it keeps giving me only one component, even though I am asking for more:

>>> import numpy as np
>>> from sklearn.lda import LDA
>>> x = np.random.randn(5,5)
>>> y = [True, False, True, False, True]
>>> for i in range(1,6):
...     lda = LDA(n_components=i)
...     model = lda.fit(x,y)
...     model.transform(x)

Gives

/Users/orthogonal/virtualenvs/osxml/lib/python2.7/site-packages/sklearn/lda.py:161: UserWarning: Variables are collinear
  warnings.warn("Variables are collinear")
array([[-0.12635305],
       [-1.09293574],
       [ 1.83978459],
       [-0.37521856],
       [-0.24527725]])
array([[-0.12635305],
       [-1.09293574],
       [ 1.83978459],
       [-0.37521856],
       [-0.24527725]])
array([[-0.12635305],
       [-1.09293574],
       [ 1.83978459],
       [-0.37521856],
       [-0.24527725]])
array([[-0.12635305],
       [-1.09293574],
       [ 1.83978459],
       [-0.37521856],
       [-0.24527725]])
array([[-0.12635305],
       [-1.09293574],
       [ 1.83978459],
       [-0.37521856],
       [-0.24527725]])

As you can see, it only prints out one dimension each time. Why is this? Does it have anything to do with the variables being collinear?

Additionally, when I do this with Scikit-Learn's PCA, it gives me what I want.

>>> from sklearn.decomposition import PCA
>>> for i in range(1,6):
...     pca = PCA(n_components=i)
...     model = pca.fit(x)
...     model.transform(x)
... 
array([[ 0.83688322],
       [ 0.79565477],
       [-2.4373344 ],
       [ 0.72500848],
       [ 0.07978792]])
array([[ 0.83688322, -1.56459039],
       [ 0.79565477,  0.84710518],
       [-2.4373344 , -0.35548589],
       [ 0.72500848, -0.49079647],
       [ 0.07978792,  1.56376757]])
array([[ 0.83688322, -1.56459039, -0.3353066 ],
       [ 0.79565477,  0.84710518, -1.21454498],
       [-2.4373344 , -0.35548589, -0.16684946],
       [ 0.72500848, -0.49079647,  1.09006296],
       [ 0.07978792,  1.56376757,  0.62663807]])
array([[ 0.83688322, -1.56459039, -0.3353066 ,  0.22196922],
       [ 0.79565477,  0.84710518, -1.21454498, -0.15961993],
       [-2.4373344 , -0.35548589, -0.16684946, -0.04114339],
       [ 0.72500848, -0.49079647,  1.09006296, -0.2438673 ],
       [ 0.07978792,  1.56376757,  0.62663807,  0.2226614 ]])
array([[  8.36883220e-01,  -1.56459039e+00,  -3.35306597e-01,
          2.21969223e-01,  -1.66533454e-16],
       [  7.95654771e-01,   8.47105182e-01,  -1.21454498e+00,
         -1.59619933e-01,   3.33066907e-16],
       [ -2.43733440e+00,  -3.55485895e-01,  -1.66849458e-01,
         -4.11433949e-02,   0.00000000e+00],
       [  7.25008484e-01,  -4.90796471e-01,   1.09006296e+00,
         -2.43867297e-01,  -1.38777878e-16],
       [  7.97879229e-02,   1.56376757e+00,   6.26638070e-01,
          2.22661402e-01,   2.22044605e-16]])

The relevant, dimension-reducing line of LDA.transform uses scalings_. As described in the docstring, scalings_ has at most n_classes - 1 columns, so that is the maximum number of columns you can hope to obtain from transform. In your case, 2 classes (True, False) yield at most 1 column.
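A quick way to see this cap in action is to fit on labels with more classes: with three classes, transform can return up to two columns, no matter how large n_components is. Below is a minimal sketch using the same old sklearn.lda.LDA import as in the question (newer scikit-learn versions expose it as sklearn.discriminant_analysis.LinearDiscriminantAnalysis); the data and labels here are made up for illustration.

>>> import numpy as np
>>> from sklearn.lda import LDA   # LinearDiscriminantAnalysis in newer scikit-learn
>>> x = np.random.randn(6, 5)
>>> y = [0, 1, 2, 0, 1, 2]        # three classes -> at most n_classes - 1 = 2 discriminant axes
>>> lda = LDA(n_components=5)     # ask for 5 components anyway
>>> lda.fit(x, y).transform(x).shape
(6, 2)

The output is capped at 2 columns rather than the requested 5. With the binary labels in the question, the same cap is why every call returns a single column, independent of n_components.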
