First time using neuralnet R package with SPECT Heart Data Set

So I'm trying out the neuralnet package to understand its usage and possible implementations.

I'm working on the SPECTF Heart Data Set, available here: https://archive.ics.uci.edu/ml/machine-learning-databases/spect/SPECTF.test

The variable I want to predict is in the first column. I merged SPECTF.test and SPECTF.train and randomly re-split them in R into test_ and train_ (all variables scaled). This is what they look like:

> str(train_)
'data.frame':   200 obs. of  45 variables:
$ V1 : num  1 1 0 1 1 1 1 1 1 1 ...
$ V2 : num  0.783 0.75 0.733 0.767 0.75 ...
$ V3 : num  0.75 0.633 0.6 0.783 0.7 ...
$ V4 : num  0.636 0.886 0.795 0.841 0.545 ...
...
$ V45: num  0.71 0.855 0.797 0.913 0.754 ...

> str(test_)
'data.frame':   67 obs. of  45 variables:
$ V1 : num  0 0 0 0 0 0 0 0 0 0 ...
$ V2 : num  0.583 0.6 0.6 0.633 0.683 ...
$ V3 : num  0.7 0.617 0.783 0.917 0.617 ...
$ V4 : num  0.955 0.705 0.705 0.75 0.727 ...
...
$ V45: num  0.899 0.812 0.899 0.797 0.797 ...

Following a tutorial on R-blogging I set up the neural network as follows:

n <- names(train_)
f <- as.formula(paste("train_[,1] ~", paste(n[!n %in% "train_[,1]"], collapse = " + ")))
nn <- neuralnet(f,data=train_,hidden=2,linear.output=T)

Up to this point it works smoothly. Then I try to make the prediction for the test data:

pr.nn <- compute(nn,test_[,2:45])

But it returns this error, which I don't know how to solve:

> pr.nn <- compute(nn,test_[,2:45])
Error in neurons[[i]] %*% weights[[i]] : non-conformable arguments

Thank you very much for your help and all your work! This community is an excellent resource!

The problem lies in how you build the formula f. Let's look at the formula you're actually creating (my dummy data uses column names X1 to X45 where yours are V1 to V45):

f <- as.formula(paste("train_[,1] ~", paste(n[!n %in% "train_[,1]"], collapse = " + ")))
> f
train_[, 1] ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + 
X11 + X12 + X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + 
X21 + X22 + X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30 + 
X31 + X32 + X33 + X34 + X35 + X36 + X37 + X38 + X39 + X40 + 
X41 + X42 + X43 + X44 + X45

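This looks good at first glance. However, if you look closely, you see that X1 is included in your formula as both the response and a predictor. The filter n[!n %in% "train_[,1]"] compares the column names against the literal string "train_[,1]", which matches none of them, so nothing is dropped. The network is therefore trained on 45 inputs, while compute() receives only the 44 columns of test_[,2:45], which is what triggers the non-conformable arguments error.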
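To see the mismatch without fitting anything, here is a minimal base R sketch. The names vector is a hypothetical stand-in for names(train_), and the weights and neurons matrices are a simplified assumption about the shapes involved (one bias term plus one row per trained input, 2 hidden units), not neuralnet's actual internals.

```r
# Hypothetical stand-in for names(train_): V1 ... V45
n <- paste0("V", 1:45)

# The original filter looks for the literal string "train_[,1]",
# which is not a column name, so every name is kept.
kept <- n[!n %in% "train_[,1]"]
length(kept)       # 45
"V1" %in% kept     # TRUE: the response is still among the predictors

# Inputs the network was trained on vs. columns passed to compute()
n_inputs_trained  <- length(kept)    # 45
n_inputs_supplied <- length(2:45)    # 44

# Assumed, simplified shapes: bias + inputs on each side
weights <- matrix(0, nrow = n_inputs_trained + 1, ncol = 2)
neurons <- cbind(1, matrix(0, nrow = 3, ncol = n_inputs_supplied))

# 3 x 45 times 46 x 2 cannot be multiplied; capture the error message
res <- tryCatch(neurons %*% weights, error = function(e) conditionMessage(e))
print(res)
```

The inner dimensions differ by exactly one, the extra column being the response that slipped into the predictors.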

If you want to select every variable besides X1, there's an easier way:

f <- as.formula(paste(paste0(n[1]," ~ "),paste(n[-1], collapse = " + ")))
> f
X1 ~ X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 + 
X13 + X14 + X15 + X16 + X17 + X18 + X19 + X20 + X21 + X22 + 
X23 + X24 + X25 + X26 + X27 + X28 + X29 + X30 + X31 + X32 + 
X33 + X34 + X35 + X36 + X37 + X38 + X39 + X40 + X41 + X42 + 
X43 + X44 + X45

You've already created a vector of variable names, so using n[1] will grab the first variable name, and n[-1] will grab everything else.
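As an alternative that avoids pasting strings, base R's reformulate() builds the same kind of formula from a character vector of term labels. This is only a sketch on a hypothetical names vector (X1 to X45, as in the dummy data); it is not something the original post used.

```r
# Hypothetical column names, matching the dummy data above
n <- paste0("X", 1:45)

# Formula built by pasting, as in the answer
f_paste <- as.formula(paste(n[1], "~", paste(n[-1], collapse = " + ")))

# Same formula built with stats::reformulate()
f_reform <- reformulate(n[-1], response = n[1])

# Both use X1 as the response and X2..X45 as predictors
all.vars(f_reform)[1]
identical(all.vars(f_paste), all.vars(f_reform))
```

Either formula can be passed to neuralnet() in place of the original f.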

I tested this on some dummy data and compute() did not raise any errors, so this should solve your issue.
