Error when looping through a list: "Error in `[<-.data.frame`(`*tmp*`, , i, value = c(7L, 1L, 4L, 7L, 7L, : new columns would leave holes…"

Question

I'm trying to write a function that loops through a list in order to run kmeans clustering on only specific columns of a dataset. I want the output to be a matrix/dataframe of the cluster membership of each observation when kmeans is run on each set of columns.

Here's a mock dataset and the function I came up with (I'm new to R, sorry if it's shaky):

set.seed(123)
mydata <- data.frame(a = rnorm(100,0,1), b = rnorm(100,0,1), c = 
rnorm(100,0,1), d = rnorm(100,0,1), e = rnorm(100,0,1)) 

set.seed(123)
my.kmeans <- function(data,k,...) {
    clusters <- data.frame(matrix(nrow = nrow(data), ncol = 
    length(list(...)))) # set up dataframe for clusters
    for(i in list(...)) {
        kmeans <- kmeans(data[,i],centers = k)
        clusters[,i] <- kmeans$cluster
    }
    colnames(clusters) <- list(...)
    clusters
}

My question is: this seems to work when I only ask it to use consecutive columns, but not when I ask it to skip some. For instance, the first of the following calls works, but the second does not. Any idea how I can fix this?

# works how I want 
head(my.kmeans(data = mydata, k = 8, c(1,2), c(2,3), c(1,2,3)))

# doesn't work 
head(my.kmeans(data = mydata, k = 8, c(1,2), c(2,3), c(1,2,5)))

Also, I know people recommend using apply functions and staying away from for loops, but I don't know how to do this with an apply function. Any advice on that would be much appreciated as well.

Thanks so much!

Danny

Answer 1

Building on @SatZ's comments:

set.seed(123)
mydata <- data.frame(a = rnorm(100,0,1), b = rnorm(100,0,1), c = 
                   rnorm(100,0,1), d = rnorm(100,0,1), e = 
                   rnorm(100,0,1)) 
mylist <- list(c(1,2), c(2,3), c(1,2,5))

set.seed(123)
my.kmeans <- function(data,k,list) {
  # set up dataframe for clusters: one column per column set in `list`
  clusters <- data.frame(matrix(nrow = nrow(data), ncol = length(list)))
  for(i in seq_along(list)) {
      # cluster on the i-th column set, but store the labels in column i
      fit <- kmeans(data[,list[[i]]],centers = k)
      clusters[,i] <- fit$cluster
  }
  colnames(clusters) <- list
  clusters
}

head(my.kmeans(data = mydata, k = 8, list = mylist))
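
The original loop failed because `for(i in list(...))` makes `i` the column set itself, so with `c(1,2,5)` the line `clusters[,i] <- ...` tried to write into column 5 of a 3-column data frame. Indexing by position in the list, as above, avoids that. On the apply part of the question, here is a minimal sketch of one way to do the same thing with `sapply`, assuming the column sets are passed as a list. The names `my.kmeans.apply` and `col_sets` are made up for illustration, and the `"1,2,5"` column labels are my own choice, not something from the question.

```r
# Hypothetical apply-style variant; my.kmeans.apply and col_sets are
# illustrative names, not part of the original code.
my.kmeans.apply <- function(data, k, col_sets) {
  # sapply runs kmeans once per column set; since every result has
  # nrow(data) labels, the results are bound into a matrix, one column per set
  clusters <- sapply(col_sets, function(cols) {
    kmeans(data[, cols, drop = FALSE], centers = k)$cluster
  })
  clusters <- as.data.frame(clusters)
  # label each output column by the input columns it was clustered on
  colnames(clusters) <- sapply(col_sets, paste, collapse = ",")
  clusters
}

set.seed(123)
mydata <- data.frame(a = rnorm(100), b = rnorm(100), c = rnorm(100),
                     d = rnorm(100), e = rnorm(100))
res <- my.kmeans.apply(mydata, k = 8,
                       col_sets = list(c(1, 2), c(2, 3), c(1, 2, 5)))
head(res)
```

Because the function body never indexes the output by the column numbers, non-consecutive sets such as `c(1,2,5)` are no longer a special case.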

Solution 1
1 2018-07-11 19:31:31
