python - linear regression - image

I am trying to wrap my head around machine learning in Python. I have been working with the following example ( http://scikit-learn.org/stable/auto_examples/plot_multioutput_face_completion.html#example-plot-multioutput-face-completion-py ), with the code shown below.

I would love to test / validate my understanding of the inner workings of the linear regression. The aim is to predict the missing lower half of a picture from the known upper half. There were originally 300 64*64 images (4096 pixels each). The independent variable X is a 300*2048 matrix (300 pictures, 2048 pixels each, the upper half of those pictures). The dependent variable y is also a 300*2048 matrix (the lower half of the pictures). It seems that the coefficient matrix is a 2048*2048 matrix. Am I right in my understanding that:

  • the prediction for a single pixel of y (e.g. picture 1, top-left pixel) is computed by multiplying all 2048 pixels in the upper half of picture 1 by that pixel's set of regression coefficients, so that each missing pixel in the lower half is estimated from all 2048 pixels of that specific image?

  • the regression coefficients are pixel dependent (each y pixel has a different set of 2048 regression coefficients), and these coefficients are estimated by finding the OLS fit for that specific pixel location across the same pixel location in the 300 available images?

I might very well be confused by the matrices, so please correct me if I am wrong. Many thanks. W

print(__doc__)

import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import fetch_olivetti_faces
from sklearn.utils.validation import check_random_state

from sklearn.ensemble import ExtraTreesRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import RidgeCV

# Load the faces datasets
data = fetch_olivetti_faces()
targets = data.target

data = data.images.reshape((len(data.images), -1))
train = data[targets < 30]
test = data[targets >= 30]  # Test on independent people

# Test on a subset of people
n_faces = 5
rng = check_random_state(4)
face_ids = rng.randint(test.shape[0], size=(n_faces, ))
test = test[face_ids, :]

n_pixels = data.shape[1]
# Slice bounds must be integers; np.ceil / np.floor return floats,
# which NumPy no longer accepts as indices.
X_train = train[:, :(n_pixels + 1) // 2]  # Upper half of the faces
y_train = train[:, n_pixels // 2:]  # Lower half of the faces
X_test = test[:, :(n_pixels + 1) // 2]
y_test = test[:, n_pixels // 2:]

# Fit estimators
ESTIMATORS = {
    "Extra trees": ExtraTreesRegressor(n_estimators=10, max_features=32,
                                       random_state=0),
    "K-nn": KNeighborsRegressor(),
    "Linear regression": LinearRegression(),
    "Ridge": RidgeCV(),
}

y_test_predict = dict()
for name, estimator in ESTIMATORS.items():
    estimator.fit(X_train, y_train)
    y_test_predict[name] = estimator.predict(X_test)

You're right.

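There are 4096 pixels in each image. Each predicted output pixel for a test image is a linear combination of that image's 2048 input pixels, weighted by the coefficients learned for that output pixel from the training set (plus an intercept for that pixel, since LinearRegression fits one by default).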
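As a rough check of the statement above, here is a minimal sketch that reproduces predict() by hand. It uses small random matrices as stand-ins for the face data (the sizes 50, 8 and 6 are made up for speed; in the question they would be 300, 2048 and 2048), so it only illustrates the arithmetic, not the actual face example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)

# Hypothetical small sizes standing in for 300 images, 2048 inputs, 2048 outputs
n_samples, n_features, n_targets = 50, 8, 6
X = rng.rand(n_samples, n_features)   # "upper halves"
Y = rng.rand(n_samples, n_targets)    # "lower halves"

model = LinearRegression().fit(X, Y)

# One row of coefficients per output pixel, one column per input pixel
coef_shape = model.coef_.shape        # (n_targets, n_features)

# Predict new samples by hand: each output pixel j is
# dot(x, coef_[j]) + intercept_[j]
X_new = rng.rand(3, n_features)
manual = X_new @ model.coef_.T + model.intercept_
same_as_predict = np.allclose(manual, model.predict(X_new))
print(coef_shape, same_as_predict)
```

Row j of coef_ is the set of weights for output pixel j, which is why every predicted pixel depends on all input pixels of the same image.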

If you look at the sklearn LinearRegression documentation, you'll see that the coefficients of a multi-target regression have shape (n_targets, n_features), which here is (2048 targets, 2048 features):

In [24]: ESTIMATORS['Linear regression'].coef_.shape
Out[24]: (2048, 2048)

Under the hood, it's calling scipy.linalg.lstsq, so it's important to note that there's no "information sharing" between the coefficients: each output pixel gets its own separate linear combination of all 2048 input pixels, fitted independently of the other output pixels.
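One way to convince yourself of the "no information sharing" point is to fit one regression per output column and compare with the joint fit. The sketch below does that on small random data (the sizes are arbitrary and chosen with more samples than features so the least-squares solution is unique); it is an illustration, not a reproduction of the face example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(1)

# Hypothetical toy data: 40 samples, 5 inputs, 4 outputs
X = rng.rand(40, 5)
Y = rng.rand(40, 4)

# Joint multi-target fit, as in the example
joint = LinearRegression().fit(X, Y)

# Separate single-target fits, one per output column
per_target_coef = np.vstack([
    LinearRegression().fit(X, Y[:, j]).coef_
    for j in range(Y.shape[1])
])

# The stacked single-target coefficients match the joint coefficient matrix
coefs_match = np.allclose(joint.coef_, per_target_coef)
print(coefs_match)
```

If the rows agree, then the fit for one output pixel is unaffected by the values of the other output pixels.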

