Separating circles using kernel PCA

I am trying to reproduce a simple example of using kernel PCA. The objective is to separate the points belonging to two concentric circles.

Creating the data:

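library(dplyr)  # provides %>% and select()

# Two groups in polar coordinates: A has radius near 0, B has radius near 1.
circle <- data.frame(radius = rep(c(0, 1), 500) + rnorm(1000, sd = 0.05),
                     phi = runif(1000, 0, 2 * pi),
                     group = rep(c("A", "B"), 500))

# Convert to Cartesian coordinates and add a pure-noise z column.
circle <- transform(circle,
                    x = radius * cos(phi),
                    y = radius * sin(phi),
                    z = rnorm(length(radius))) %>% select(group, x, y, z)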

TFRAC <- 0.75  # fraction of rows used for training

train <- sample(1:1000, TFRAC * 1000)

circle.train <- circle[train, ]
circle.test <- circle[-train, ]

> head(circle.train)
    group         x          y        z
491     A -0.034216 -0.0312062  0.70780
389     A  0.052616  0.0059919  1.05942
178     B -0.987276 -0.3322542  0.75297
472     B -0.808646  0.3962935 -0.17829
473     A -0.032227  0.0027470  0.66955
346     B  0.894957  0.3381633  1.29191

I have split the data into training and testing sets because I intend (once I get this working!) to test the resulting model.

[image]

In principle, kernel PCA should allow me to separate the two classes. Other discussions of this example have used the Radial Basis Function (RBF) kernel, so I adopted this too. In R, kernel PCA is implemented in the kernlab package.

library(kernlab)

circle.kpca <- kpca(~ ., data = circle.train[, -1], kernel = "rbfdot", kpar = list(sigma = 10), features = 1)

I requested only the first component and specified the RBF kernel. This is the result:

[image]
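For readers who want to look at the same result numerically, here is a minimal sketch. It assumes kernlab is installed, rebuilds the data with base R only, and fixes a seed (an assumption, not part of the original post). It uses `rotated()` to get the training scores on the first component and `predict()` to project the held-out points, then summarises both by group. It makes no claim about how well any particular `sigma` separates the classes.

```r
library(kernlab)

set.seed(42)  # assumption: fixed seed so the sketch is repeatable

# Rebuild the question's data with base R only (no dplyr needed here).
n <- 1000
radius <- rep(c(0, 1), n / 2) + rnorm(n, sd = 0.05)
phi <- runif(n, 0, 2 * pi)
circle <- data.frame(group = rep(c("A", "B"), n / 2),
                     x = radius * cos(phi),
                     y = radius * sin(phi),
                     z = rnorm(n))

train <- sample(seq_len(n), 0.75 * n)
circle.train <- circle[train, ]
circle.test <- circle[-train, ]

# Same call as in the question: RBF kernel, first component only.
circle.kpca <- kpca(~ ., data = circle.train[, -1], kernel = "rbfdot",
                    kpar = list(sigma = 10), features = 1)

# Scores of the training points on the first kernel principal component.
train.pc1 <- rotated(circle.kpca)[, 1]

# Per-group ranges: cleanly separated classes would have non-overlapping ranges.
print(tapply(train.pc1, circle.train$group, range))

# Project the held-out points onto the same component.
test.pc1 <- predict(circle.kpca, circle.test[, -1])[, 1]
print(tapply(test.pc1, circle.test$group, range))
```

Comparing the two per-group ranges is a cruder check than a plot, but it is easy to repeat for several values of `sigma`.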

The data have definitely been transformed substantially, but the result is not what I was expecting (a nice, clean separation of the two classes). I have tried fiddling with the value of the parameter sigma and, although the results vary dramatically, I still did not get what I expected. I assume that sigma is related to the parameter gamma mentioned here, possibly via the relationship given here (without the negative sign?).

I'm pretty sure that I am making a naive rookie error here, and I would really appreciate any pointers that would get me onto the right track.

Thanks, Andrew.

Try sigma = 20; I think you will get the result you are looking for. The sigma in kernlab is what is usually called gamma for the RBF kernel, so it is inversely related to the kernel width (the quantity more commonly called sigma).
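To make the parametrisation concrete, here is a small base R sketch. kernlab documents its `rbfdot` kernel as exp(-sigma * ||x - y||^2), while the width form is commonly written exp(-||x - y||^2 / (2 * s^2)), so the two agree when sigma = 1 / (2 * s^2). The helper names `rbf_kernlab` and `rbf_width` and the two sample points are hypothetical, chosen only for illustration.

```r
# RBF kernel in kernlab's parametrisation: k(x, y) = exp(-sigma * ||x - y||^2)
rbf_kernlab <- function(x, y, sigma) exp(-sigma * sum((x - y)^2))

# The same kernel written with a width s: k(x, y) = exp(-||x - y||^2 / (2 * s^2))
rbf_width <- function(x, y, s) exp(-sum((x - y)^2) / (2 * s^2))

sigma <- 20
s <- 1 / sqrt(2 * sigma)  # width equivalent to kernlab's sigma = 20

# Two arbitrary illustrative points (squared distance 0.05).
a <- c(0, 0)
b <- c(0.1, 0.2)

k1 <- rbf_kernlab(a, b, sigma)
k2 <- rbf_width(a, b, s)
print(c(kernlab_style = k1, width_style = k2))
```

A larger kernlab `sigma` therefore means a narrower kernel: sigma = 20 corresponds to a width of roughly 0.16, which is small compared with the unit radius of the outer circle.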
