
Gaussian Process Model with Noise: ValueError: nugget must be either a scalar or array of length n_samples

I am building a classification model using a Gaussian process with noise, and I don't understand why it is failing with a ValueError.

I have a data set in which about 10% of the samples are labeled with a target of 1 or 0. I am trying to predict the probability that each of the remaining 90% is a 1.

I have used sklearn to split the labeled set into a training and a test set.

X is the training feature array (feature_training); it is an np.array with shape (54, 9).

y is the target array (feature_target); it is an np.array with shape (54, 1).

Both are float arrays, and the noise is calculated as:

dy = 0.5 + 1.0 * np.random.random(y.shape)  # per-sample noise scale, shape (54, 1)
noise = np.random.normal(0, dy)             # Gaussian noise with std dev dy
y = (y + noise)

y.shape
(54,1)

The nugget is of type numpy.ndarray with shape (54, 1).
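
For reference, here is a quick shape check along the same lines (placeholder values rather than my real data), showing that the nugget inherits the 2-D shape of y:

import numpy as np

y = np.random.random((54, 1))               # same shape as my target array
dy = 0.5 + 1.0 * np.random.random(y.shape)  # noise scale, shape (54, 1)
nugget = (dy / y) ** 2

print(nugget.shape)  # prints (54, 1)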

The Gaussian process model I am using:

from sklearn.gaussian_process import GaussianProcess

gp = GaussianProcess(corr='squared_exponential', theta0=1e-1,
                     thetaL=1e-3, thetaU=1,
                     nugget=(dy / y) ** 2,
                     random_start=100)

gp.fit(X, y)

This fails with: ValueError: nugget must be either a scalar or array of length n_samples

X, y, and nugget are all of type numpy.ndarray, and all seem to have the correct shapes. I think the nugget has length n_samples (54), so it should satisfy the length requirement.

Is there something obvious that I am missing?

Your y needs to be a vector of shape (n,), not an array of shape (n,1). You can fix this with:

y = y.reshape((len(y),))
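
For completeness, here is a minimal end-to-end sketch of the corrected flow (placeholder data; this assumes the legacy sklearn GaussianProcess API from your snippet, which newer scikit-learn releases have replaced with GaussianProcessRegressor). Note that the flattening has to happen before dy is derived from y: otherwise (dy / y) divides a (54, 1) array by a (54,) one, and NumPy broadcasting silently produces a (54, 54) nugget.

import numpy as np
from sklearn.gaussian_process import GaussianProcess

X = np.random.random((54, 9))    # placeholder features
y = np.random.random(54) + 0.5   # placeholder targets, already shape (54,)

dy = 0.5 + 1.0 * np.random.random(y.shape)  # shape (54,)
y = y + np.random.normal(0, dy)             # still shape (54,)

gp = GaussianProcess(corr='squared_exponential', theta0=1e-1,
                     thetaL=1e-3, thetaU=1,
                     nugget=(dy / y) ** 2,   # shape (54,) -- passes the check
                     random_start=100)
gp.fit(X, y)

y.ravel() is an equivalent way to flatten; the key point is that both y and the nugget derived from it end up 1-D of length n_samples.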
