
Classification functions in linear discriminant analysis in R

After completing a linear discriminant analysis in R using lda(), is there a convenient way to extract the classification functions for each group?

From the link:

These are not to be confused with the discriminant functions. The classification functions can be used to determine to which group each case most likely belongs. There are as many classification functions as there are groups. Each function allows us to compute classification scores for each case for each group, by applying the formula:

Si = ci + wi1*x1 + wi2*x2 + ... + wim*xm

In this formula, the subscript i denotes the respective group; the subscripts 1, 2, ..., m denote the m variables; ci is a constant for the i'th group; wij is the weight for the j'th variable in the computation of the classification score for the i'th group; xj is the observed value for the respective case for the j'th variable. Si is the resultant classification score.
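To make the formula concrete, here is a minimal sketch that evaluates it for one case with two groups and two variables. The constants and weights are invented purely for illustration (they do not come from any fitted model), and the names consts, weights and x are hypothetical.

```r
# Hypothetical constants c_i and weights w_ij (made up for illustration only)
consts  <- c(group1 = -1, group2 = -2)
weights <- rbind(group1 = c(x1 = 0.5, x2 = 1.0),
                 group2 = c(x1 = 1.0, x2 = 0.5))

# One observed case with values for x1 and x2
x <- c(x1 = 2, x2 = 1)

# S_i = c_i + w_i1 * x1 + w_i2 * x2, computed for every group at once
scores <- consts + as.vector(weights %*% x)
scores

# The case is assigned to the group with the largest classification score
names(which.max(scores))
```

Here S1 = -1 + 0.5*2 + 1.0*1 = 1 and S2 = -2 + 1.0*2 + 0.5*1 = 0.5, so the case would be assigned to group1.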

We can use the classification functions to directly compute classification scores for some new observations.

I can build them from scratch using textbook formulas, but that requires rebuilding a number of intermediate steps from the lda analysis. Is there a way to get them after the fact from the lda object?

Added:

Unless I'm still misunderstanding something in Brandon's answer (sorry for the confusion!), it appears the answer is no. Presumably the majority of users can get the information they need from predict(), which provides classifications based on lda().
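For readers who only need the classifications, a short sketch of the predict() route might look like the following. It assumes the MASS package and uses the built-in iris data as a stand-in; the object names fit, new_obs and p are illustrative and not from the original question.

```r
library(MASS)  # provides lda() and its predict() method

# Fit on the built-in iris data (an assumption; any data with a grouping factor would do)
fit <- lda(Species ~ ., data = iris)

# Treat three rows as "new" observations
new_obs <- iris[c(1, 51, 101), ]
p <- predict(fit, new_obs)

p$class                 # predicted group for each new observation
round(p$posterior, 3)   # posterior probability of each group, one row per observation
```

predict() returns the predicted class and the posterior probabilities, but not the per-group constants and weights themselves, which is what the question is after.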

Suppose x is your LDA object:

x$terms

You can take a peek at the object by looking at its structure:

str(x)
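As a lighter alternative to str(), the components can also be listed and pulled out by name. This is a small sketch assuming the MASS package and the built-in iris data; fit is an illustrative name.

```r
library(MASS)  # provides lda()

fit <- lda(Species ~ ., data = iris)

names(fit)    # component names of the fitted lda object
fit$means     # group means: one row per group, one column per variable
fit$scaling   # coefficients of the linear discriminants (LD1, LD2)
```

Note that none of these components holds the classification function coefficients directly; means and scaling are the closest building blocks.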

Update:

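library(MASS)  # provides lda() and predict() for lda objects

# Stack the three 50 x 4 slices of iris3 into one data frame and label the species
Iris <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
                   Sp = rep(c("s","c","v"), rep(50,3)))
train <- sample(1:150, 75)   # random half of the rows used for training
table(Iris$Sp[train])
z <- lda(Sp ~ ., Iris, prior = c(1,1,1)/3, subset = train)
predict(z, Iris[-train, ])$class
str(z)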
List of 10
 $ prior  : Named num [1:3] 0.333 0.333 0.333
  ..- attr(*, "names")= chr [1:3] "c" "s" "v"
 $ counts : Named int [1:3] 30 25 20
  ..- attr(*, "names")= chr [1:3] "c" "s" "v"
 $ means  : num [1:3, 1:4] 6.03 5.02 6.72 2.81 3.43 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:3] "c" "s" "v"
  .. ..$ : chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
 $ scaling: num [1:4, 1:2] 0.545 1.655 -1.609 -3.682 -0.443 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
  .. ..$ : chr [1:2] "LD1" "LD2"
 $ lev    : chr [1:3] "c" "s" "v"
 $ svd    : num [1:2] 33.66 2.93
 $ N      : int 75
 $ call   : language lda(formula = Sp ~ ., data = Iris, prior = c(1, 1, 1)/3, subset = train)
 $ terms  :Classes 'terms', 'formula' length 3 Sp ~ Sepal.L. + Sepal.W. + Petal.L. + Petal.W.
  .. ..- attr(*, "variables")= language list(Sp, Sepal.L., Sepal.W., Petal.L., Petal.W.)
  .. ..- attr(*, "factors")= int [1:5, 1:4] 0 1 0 0 0 0 0 1 0 0 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:5] "Sp" "Sepal.L." "Sepal.W." "Petal.L." ...
  .. .. .. ..$ : chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
  .. ..- attr(*, "term.labels")= chr [1:4] "Sepal.L." "Sepal.W." "Petal.L." "Petal.W."
  .. ..- attr(*, "order")= int [1:4] 1 1 1 1
  .. ..- attr(*, "intercept")= int 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv>
  .. ..- attr(*, "predvars")= language list(Sp, Sepal.L., Sepal.W., Petal.L., Petal.W.)
  .. ..- attr(*, "dataClasses")= Named chr [1:5] "factor" "numeric" "numeric" "numeric" ...
  .. .. ..- attr(*, "names")= chr [1:5] "Sp" "Sepal.L." "Sepal.W." "Petal.L." ...
 $ xlevels: Named list()
 - attr(*, "class")= chr "lda"

I think your question was flawed ... OK, maybe not flawed but somewhat misleading at the very least. The discriminant functions refer to distances between groups, so there is no function associated with a single group but rather a function that describes the distance between any two group centroids. I just answered a more recent question and placed an example of calculating a score function using the iris dataset and using it to label cases in a 2d plot of predictors. In the case of a 2 group analysis the function will be greater than zero for one group and less than zero for the other group.

There isn't a built-in way to get the information I needed, so I wrote a function to do it:
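The two-group behaviour can be checked with a short sketch. It assumes the MASS package and uses two of the three iris species as a stand-in; the names two, fit, pred and ld1 are illustrative. The groups are the same size, so the estimated priors are equal and the cut-off on LD1 falls at zero.

```r
library(MASS)  # provides lda()

# Keep two species only, so there is a single discriminant function (LD1)
two <- droplevels(subset(iris, Species != "setosa"))
fit <- lda(Species ~ ., data = two)

pred <- predict(fit, two)
ld1  <- pred$x[, "LD1"]   # discriminant score of each case

# Cross-tabulate the sign of LD1 against the predicted group
table(sign = ifelse(ld1 > 0, "positive", "negative"), class = pred$class)
```

Each sign lines up with exactly one predicted group; which group gets the positive side depends on the arbitrary orientation of LD1.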

library(MASS)  # provides lda()

ty.lda <- function(x, groups){
  x.lda <- lda(groups ~ ., as.data.frame(x))

  gr <- length(unique(groups))   ## number of groups (labels may be factor or numeric)
  v <- ncol(x) ## number of variables
  m <- x.lda$means ## group means, one row per group

  ## within-group sums of squares and cross-products, one v x v slice per group
  w <- array(NA, dim = c(v, v, gr))

  for(i in 1:gr){
    tmp <- scale(subset(x, groups == unique(groups)[i]), scale = FALSE)
    w[,,i] <- t(tmp) %*% tmp
  }

  ## pool the slices over all groups
  W <- w[,,1]
  for(i in 2:gr)
    W <- W + w[,,i]

  V <- W/(nrow(x) - gr)   ## pooled within-group covariance matrix
  iV <- solve(V)

  ## rows: constant, then one weight per variable; columns: groups in the row order of m
  class.funs <- matrix(NA, nrow = v + 1, ncol = gr)
  colnames(class.funs) <- paste("group", 1:gr, sep=".")
  rownames(class.funs) <- c("constant", paste("var", 1:v, sep = "."))

  for(i in 1:gr) {
    class.funs[1, i] <- -0.5 * t(m[i,]) %*% iV %*% (m[i,])
    class.funs[2:(v+1) ,i] <- iV %*% (m[i,])
  }

  x.lda$class.funs <- class.funs

  return(x.lda)
}

This code follows the formulas in Legendre and Legendre's Numerical Ecology (1998), page 625, and matches the results of the worked example starting on page 626.
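As a rough cross-check of the same idea, the sketch below computes constants and weights in vectorized form on the built-in iris data and compares the resulting assignments with predict(). It is not the code above, only an independent illustration. It assumes the MASS package, and the names X, g, V, weights, constants and scores are illustrative. Because the three iris groups are the same size, the prior term is identical across groups and is left out, as in the function above.

```r
library(MASS)  # provides lda()

X <- as.matrix(iris[, 1:4])
g <- iris$Species
fit <- lda(Species ~ ., data = iris)

# Pooled within-group covariance: summed per-group scatter matrices / (n - k)
k <- nlevels(g)
scatter <- lapply(split(as.data.frame(X), g), function(d) {
  centered <- scale(as.matrix(d), scale = FALSE)  # center each group on its own mean
  crossprod(centered)
})
V <- Reduce(`+`, scatter) / (nrow(X) - k)

# Weights: V^-1 %*% m_i, one column per group; constants: -0.5 * m_i' V^-1 m_i
weights <- solve(V, t(fit$means))
colnames(weights) <- rownames(fit$means)
constants <- -0.5 * colSums(t(fit$means) * weights)

# Classification scores for every case and group, then pick the largest
scores <- sweep(X %*% weights, 2, constants, `+`)
pred_manual <- colnames(scores)[max.col(scores, ties.method = "first")]

# Compare with the assignments from predict()
table(manual = pred_manual, predict = predict(fit, iris)$class)
```

If the two approaches agree, all counts in the table fall on the diagonal. With unequal priors, a log-prior term would need to be added to each constant for the assignments to match predict().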
