
Problems with points and apply in R for linear discriminant analysis

I have some coding questions that arose while doing exercises on linear discriminant analysis. We are using the iris data:

## Read in dataset, set seed, load package
Iris <- iris[,-(1:2)]
grIris <- as.integer(iris[,"Species"])
set.seed(16)
library(MASS)

## Read n
n <- nrow(Iris)

As you can see, we delete the first and second columns of iris. What I want to do is bootstrap this data using linear discriminant analysis. Here is my code:

ind <- replicate(B,sample(seq(1:n),n,replace=TRUE))

This generates the indices I want to use. Note that B is some large number, e.g. 1000. Now I want to use apply, but why doesn't the following code work?

bst.sample <- apply(ind,2,lda(Species~Petal.Length+Petal.Width,data=Iris[ind,]))

where Species, Petal.Length, etc. are the columns from iris. If I use a for loop everything works fine, but of course I would like to implement this in a more elegant way.

My second question is about points. I also wanted to calculate the estimated means, which I've done with the following code:

est.lda <- vector("list",B)
est.qda <- vector("list",B)
mu_hat_1 <- mu_hat_2 <- mu_hat_3 <- matrix(0,ncol=B,nrow=2)
for (i in 1:B){
  est.lda[[i]] <- lda(Species~Petal.Length+Petal.Width,data=Iris[ind[,i],])
  mu_hat_1[,i] <- est.lda[[i]]$means[1,]
  mu_hat_2[,i] <- est.lda[[i]]$means[2,]
  mu_hat_3[,i] <- est.lda[[i]]$means[3,]
  est.qda[[i]] <- qda(Species~Petal.Length+Petal.Width,data=Iris[ind[,i],])

}

plot(mu_hat_1[1,],mu_hat_1[2,],pch=4)
points(mu_hat_2[1,],mu_hat_2[2,],pch=4,col=2)
points(mu_hat_3[1,],mu_hat_3[2,],pch=4,col=3)

The plot at the end should show three regions with the expected means of the three classes. However, only the first plot is shown.

Thank you for your help.

B <- 10
ind <- replicate(B,sample(seq(1:n),n,replace=TRUE))

#you need to pass a function to apply
bst.sample <- apply(ind,2, 
                function(i) lda(Species~Petal.Length+Petal.Width,data=Iris[i,]))
#extract means
bst.means <- lapply(bst.sample,function(x) x$means)

#bind means into array
library(abind)
bst.means <- do.call(function(...) abind(..., along=3), bst.means)

#you need to make sure that all points are inside the axis limits
plot(bst.means[1,1,],bst.means[1,2,], 
     xlim=range(bst.means[,1,]), ylim=range(bst.means[,2,]), 
     xlab=dimnames(bst.means)[[2]][1],ylab=dimnames(bst.means)[[2]][2],
     col=1)
points(bst.means[2,1,],bst.means[2,2,], col=2)
points(bst.means[3,1,],bst.means[3,2,], col=3)
legend("topleft", legend=dimnames(bst.means)[[1]], col=1:3, pch=1)
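
As a possible alternative to the abind step above, the means can also be collected with base R only. The following is a minimal sketch, not the approach used in the answer: it assumes that every bootstrap sample contains all three species (so that `$means` is always a 3 x 2 matrix) and uses `vapply`, whose array-valued `FUN.VALUE` makes the result a 3 x 2 x B array.

```r
library(MASS)  # provides lda()

set.seed(16)
Iris <- iris[, -(1:2)]  # keep Petal.Length, Petal.Width, Species
n <- nrow(Iris)
B <- 10                 # small B for illustration only

# n x B matrix: each column holds the row indices of one bootstrap sample
ind <- replicate(B, sample(n, n, replace = TRUE))

# Fit lda on each bootstrap sample and keep only the class means.
# Because FUN.VALUE is a 3 x 2 matrix, vapply stacks the results
# into a 3 x 2 x B array (assumes all three classes are present).
bst.means <- vapply(
  seq_len(B),
  function(b) {
    fit <- lda(Species ~ Petal.Length + Petal.Width, data = Iris[ind[, b], ])
    fit$means
  },
  FUN.VALUE = matrix(0, nrow = 3, ncol = 2)
)

dim(bst.means)
```

The resulting array has the same layout as the one built with abind, so the plot and points calls above should work with it unchanged. Setting xlim and ylim from the range over all three classes remains necessary, because plot fixes the axis limits when it is first called and points does not rescale them.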

