R: cannot run partial least squares regression on more than one descriptor

Question

I generated a CSV table, "T.CSV":

"system","response","NIR.a","NIR.b"
 1,1,2,3
 2,4,5,6
 3,7,8,9

With this table, plsr succeeds with one descriptor but fails when I try to use several descriptors:

> library(pls)
> j <- read.csv(file="T.CSV",header=T,sep=",")
> head(j)
  system response NIR.a NIR.b
1      1        1     2     3
2      2        4     5     6
3      3        7     8     9
> mod <- plsr(response ~ NIR.a , data = j ,  ncomp=1 )
> mod <- plsr(response ~ NIR , data = j ,  ncomp=1 )
Error in eval(expr, envir, enclos) : object 'NIR' not found

However, if I load the "oliveoil" example from the pls package, regression works with more than one descriptor:

> data(oliveoil)
> head(oliveoil)
   chemical.Acidity chemical.Peroxide chemical.K232 chemical.K270 chemical.DK
G1             0.73              12.7           1.9         0.139       0.003
G2             0.19              12.3         1.678         0.116      -0.004
G3             0.26              10.3         1.629         0.116      -0.005
G4             0.67              13.7         1.701         0.168      -0.002
G5             0.52              11.2         1.539         0.119      -0.001
I1             0.26              18.7         2.117         0.142       0.001
   sensory.yellow sensory.green sensory.brown sensory.glossy sensory.transp
G1           21.4          73.4          10.1           79.7           75.2
G2           23.4          66.3           9.8           77.8           68.7
G3           32.7          53.5           8.7           82.3           83.2
G4           30.2          58.3          12.2           81.1           77.1
G5           51.8          32.5             8           72.4           65.3
I1           40.7          42.9          20.1           67.7           63.5
   sensory.syrup
G1          50.3
G2          51.7
G3          45.4
G4          47.8
G5          46.5
I1          52.2

Here plsr works with multiple descriptors:

> mod <- plsr(chemical ~ sensory , data = oliveoil ,  ncomp=1 )
>

Can you please advise on what I did wrong with my first table?

Thanks in advance!

Answer 1

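If we look at str(oliveoil), 'sensory' is a single column of the data.frame that holds a matrix with n columns of its own (head() prints them as sensory.yellow, sensory.green, and so on, which makes them look like separate columns). So, to use the formula in that way, "NIR" should also be a matrix stored as one column inside a data.frame: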

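library(pls)
# Keep system and response, then store both NIR columns as one matrix column
j1 <- j[1:2]
j1["NIR"] <- as.matrix(setNames(j[3:4], letters[1:2]))  # matrix columns "a" and "b"
mod <- plsr(response ~ NIR , data = j1 ,  ncomp=1 )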
str(mod)
#List of 19
# $ coefficients   : num [1:2, 1, 1] 0.5 0.5
# ..- attr(*, "dimnames")=List of 3
# .. ..$ : chr [1:2] "a" "b"
# .. ..$ : chr "response"
# .. ..$ : chr "1 comps"
# ----
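
A self-contained sketch of the same idea, in case it helps to see the structure directly. It rebuilds the question's table and attaches the two NIR columns as one matrix column using $<- rather than [<-. The names nir and j2 are made up for this illustration, and the plsr call is skipped when pls is not installed.

```r
# Rebuild the question's table (same values as T.CSV).
j <- data.frame(system = 1:3, response = c(1L, 4L, 7L),
                NIR.a = c(2L, 5L, 8L), NIR.b = c(3L, 6L, 9L))

# Pack the two NIR columns into a single matrix with columns "a" and "b".
nir <- as.matrix(j[c("NIR.a", "NIR.b")])
colnames(nir) <- c("a", "b")

# j2 is a hypothetical name: one response column plus one matrix column.
j2 <- data.frame(response = j$response)
j2$NIR <- nir

print(names(j))     # four separate columns, none of them called "NIR"
print(names(j2))    # "response" "NIR"
print(dim(j2$NIR))  # 3 rows, 2 columns inside the single NIR column

# Fit only if the pls package is available on this machine.
if (requireNamespace("pls", quietly = TRUE)) {
  mod <- pls::plsr(response ~ NIR, data = j2, ncomp = 1)
  print(drop(coef(mod)))
}
```

Note that head(j2) prints the columns as NIR.a and NIR.b, just like head(j), so the printed output alone does not tell the two layouts apart; names() or str() does.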

data

j <- structure(list(system = 1:3, response = c(1L, 4L, 7L),
 NIR.a = c(2L, 
 5L, 8L), NIR.b = c(3L, 6L, 9L)), .Names = c("system", "response", 
 "NIR.a", "NIR.b"), class = "data.frame", row.names = c(NA, -3L))

Answer 2

In your command, mod <- plsr(response ~ NIR , data = j , ncomp=1 ) , make sure the names of the explanatory variables on the RHS of the ~ exactly match the column names in the data (spelling and upper/lower case). In R's output from head(j) there is no column called NIR, but there is one called NIR.a and one called NIR.b. Have you checked whether replacing NIR with NIR.a or NIR.b works?
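
To make that check concrete, here is a rough sketch that keeps the plain columns and names both descriptors in the formula. It is an alternative to building a matrix column, written under the assumption that the table looks exactly like the one in the question; the variable names f and missing_vars are only illustrative.

```r
# The question's table, with NIR.a and NIR.b as ordinary columns.
j <- data.frame(system = 1:3, response = c(1L, 4L, 7L),
                NIR.a = c(2L, 5L, 8L), NIR.b = c(3L, 6L, 9L))

# Formula that lists each descriptor by its actual column name.
f <- response ~ NIR.a + NIR.b

# Every variable in the formula should exist in the data;
# an empty result means nothing is missing.
missing_vars <- setdiff(all.vars(f), names(j))
print(missing_vars)

# The original formula fails this check because "NIR" is not a column.
print(setdiff(all.vars(response ~ NIR), names(j)))

# Fit only if the pls package is available on this machine.
if (requireNamespace("pls", quietly = TRUE)) {
  mod <- pls::plsr(f, data = j, ncomp = 1)
  print(drop(coef(mod)))
}
```

Listing columns one by one is fine for two descriptors; for wide spectra the single matrix column is usually less work to write.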

solution1
1 ACCEPTED 2016-04-18 03:39:47

solution2
0 2016-04-18 03:33:36