
Indexing variables with lapply and purrr::map

I have this data.frame:

dataset=structure(list(var1 = c(28.5627505742013, 22.8311421908438, 
95.2216156944633, 
43.9405107684433, 97.11211245507, 48.4108281508088), var2 = c(32.9009465128183, 
54.1136392951012, 69.3181485682726, 70.2100433968008, 44.0986660309136, 
62.8759404085577), var3 = c(89.6971945464611, 67.174579706043, 
37.0924087055027, 87.7977314218879, 29.3221596442163, 37.5143952667713
), var4 = c(41.5336912125349, 98.2095112837851, 80.7970978319645, 
91.1278881691396, 66.4086666144431, 69.2618868127465), var5 = c(33.9312525652349, 
88.1815139763057, 98.4453701227903, 25.0217059068382, 41.1195872165263, 
37.0983888953924), var6 = c(39.813664201647, 80.6405956856906, 
30.0273275375366, 34.6203793399036, 96.5195455029607, 44.5830867439508
), kmeans = structure(c(2L, 1L, 3L, 1L, 3L, 1L), .Label = c("1", 
"2", "3"), class = "factor")), .Names = c("var1", "var2", "var3", 
"var4", "var5", "var6", "kmeans"), row.names = c(NA, 6L), class = "data.frame")

With lapply and purrr::map on the data.frame, the result is OK. See:

lapply(dataset[c(1:6)],shapiro.test)

purrr::map(dataset[c(1:6)],shapiro.test)

OK. Now I want to apply this to a list.

Create the list (mylist):

set.seed(1234)
for(i in 1:6){
names<-paste0('var',i)
assign(names,runif(30,20,100))
}

dataset<-do.call(
cbind.data.frame,
mget(ls(pattern='*va'))
)

cluster<-kmeans(dataset,3)
dataset$kmeans<-as.factor(cluster[['cluster']])
mylist<-split(dataset,dataset$kmeans)
names(mylist)<-paste0('dataset',seq_along(mylist))

Create the function (f):

f<-function(x){
  apply(x,2,shapiro.test)
}

Then apply this function with lapply and purrr::map:

lapply(mylist[c(1:6)],f)
#Error: is.numeric(x) is not TRUE

purrr::map(mylist[c(1:6)],f)
#Error: is.numeric(x) is not TRUE

I also tried these variants:

lapply(mylist[c(1:6)],function(x){
    lapply(x,shapiro.test)
})
#Error: is.numeric(x) is not TRUE 

lapply(mylist[c(1:6)],function(x){
  lapply(x,f)
})
#Error in apply(x, 2, shapiro.test) : dim(X) must have a positive length 

mylist[c(1:6)]%>%
  map(~map(.,shapiro.test))
#Error: is.numeric(x) is not TRUE

mylist[c(1:6)]%>%
  map(~map(.,f))
#Error in apply(x, 2, shapiro.test) : dim(X) must have a positive length

What's wrong?

The apply documentation states:

If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.

Since the last column of each data frame is a factor (the kmeans column), the as.matrix call coerces the whole data frame to a character matrix, so every column that apply passes on is a character vector (which shapiro.test does not accept as input).

It will work if you select only the numeric columns inside the function that calls apply:
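The coercion can be checked on a small example. This is a minimal sketch with a hypothetical toy data frame (`toy`, with columns `a`, `b` and `grp` standing in for the numeric variables and the `kmeans` factor); none of these names come from the question.

```r
# Hypothetical toy data: two numeric columns plus a factor, like `kmeans`.
toy <- data.frame(
  a   = c(1.5, 2.5, 3.5),
  b   = c(10, 20, 30),
  grp = factor(c("1", "2", "1"))
)

# With the factor column included, the whole matrix becomes character.
m_all <- as.matrix(toy)
typeof(m_all)  # "character"

# With only the numeric columns, the matrix stays numeric.
m_num <- as.matrix(toy[c("a", "b")])
typeof(m_num)  # "double"

# apply() therefore hands character vectors to the function it calls.
apply(toy, 2, is.numeric)               # FALSE for every column
apply(toy[c("a", "b")], 2, is.numeric)  # TRUE for both columns
```

Because apply works on the coerced matrix and not on the original columns, a single non-numeric column is enough to make every column fail the is.numeric check inside shapiro.test.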

f<-function(x){
  # keep only the six numeric columns; the factor column kmeans is dropped
  apply(x[c(1:6)], 2, shapiro.test)
}

lapply(mylist, f)
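Selecting columns by position works here, but it breaks if the column order changes. A possible alternative, sketched below, is to select columns by type and skip apply altogether, since lapply over a data frame already iterates over its columns without any matrix coercion. The data (`toy`, `toy_list`) and the helper name `shapiro_numeric` are hypothetical, and the groups are fixed by hand instead of coming from kmeans so that every group is known to have ten rows.

```r
# Toy data shaped like the question's: numeric variables plus a factor.
set.seed(1234)
toy <- data.frame(
  var1 = runif(30, 20, 100),
  var2 = runif(30, 20, 100),
  var3 = runif(30, 20, 100)
)
toy$kmeans <- factor(rep(1:3, each = 10))  # fixed groups, not real clusters
toy_list <- split(toy, toy$kmeans)

# Hypothetical helper: run shapiro.test on the numeric columns only.
# Filter() keeps the columns for which is.numeric() returns TRUE.
shapiro_numeric <- function(df) {
  lapply(Filter(is.numeric, df), shapiro.test)
}

results <- lapply(toy_list, shapiro_numeric)

# Matrix of p-values: one row per variable, one column per group.
sapply(results, function(tests) sapply(tests, function(h) h$p.value))
```

Each element of `results` is a named list of `htest` objects, one per numeric variable, so individual tests can be reached with, for example, `results[["1"]]$var1`.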

Note: Try running as.matrix(dataset[1,]) and as.matrix(dataset[1,c(1:6)]) to see the difference.
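The purrr attempts from the question can be repaired along the same lines. Two things matter: the factor column has to be dropped before the inner map, and the list should not be indexed past its length. The list produced by split has only three elements (one per cluster), so `mylist[c(1:6)]` pads it with three NULL entries, which would fail even with the corrected function. The sketch below assumes purrr is installed and uses hypothetical toy data (`toy`, `toy_list`) with hand-made groups.

```r
library(purrr)

# Toy data shaped like the question's: numeric variables plus a factor.
set.seed(1234)
toy <- data.frame(var1 = runif(30, 20, 100), var2 = runif(30, 20, 100))
toy$kmeans <- factor(rep(1:3, each = 10))  # fixed groups, not real clusters
toy_list <- split(toy, toy$kmeans)

# Indexing past the end of a list pads it with NULL entries.
length(toy_list)       # 3
length(toy_list[1:6])  # 6, the last three elements are NULL

# Outer map: one data frame per group.
# Inner map: one shapiro.test per numeric column of that data frame.
results <- map(toy_list, ~ map(keep(.x, is.numeric), shapiro.test))

# Extract the p-values as a named numeric vector per group.
p_values <- map(results, function(tests) {
  map_dbl(tests, function(h) h$p.value)
})
p_values
```

`keep(.x, is.numeric)` plays the same role as `x[c(1:6)]` in the base R version, but selects by column type, not by position.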
