How to increase Logistic Regression score on Sklearn PCA?
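I want to compare LeNet and PCA for image recognition, so I used the German Traffic Sign benchmark and the Sklearn PCA module. But when I test it with Logistic Regression, the score never gets above 6%, no matter what I try.
I tried changing the number of iterations and the preprocessing (normalization and equalization), but it still doesn't work.
The data is loaded with pickle from three files: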
train.p, with shape of (34799, 32, 32, 3)
test.p, with shape of (12630, 32, 32, 3)
valid.p, with shape of (4410, 32, 32, 3)
Each comes with its labels, as y_train, y_test and y_valid. Here is the relevant part of the code:
import cv2
import numpy as np

def gray_scale(image):
    """
    Convert images to gray scale.
    Parameters:
    image: An np.array compatible with plt.imshow.
    """
    return cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

def preprocess2(data):
    # Convert every RGB image to gray scale, then add a trailing channel axis.
    n_training = data.shape
    gray_images = np.zeros((n_training[0], n_training[1], n_training[2]))
    for i, img in enumerate(data):
        gray_images[i] = gray_scale(img)
    gray_images = gray_images[..., None]
    return gray_images

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
pca = PCA(0.95)
X_train_preprocess = preprocess2(X_train)
#Removing one dimension (34799,32,32,1) to (34799,32,32)
X_train_preprocess = X_train_preprocess.reshape(34799,32,32)
nsamples, nx, ny = X_train_preprocess.shape
X_train_preprocess = X_train_preprocess.reshape((nsamples,nx*ny))
X_test_preprocess = preprocess2(X_test)
#Removing one dimension (12630,32,32,1) to (12630,32,32)
X_test_preprocess = X_test_preprocess.reshape(12630,32,32)
n2samples, n2x, n2y = X_test_preprocess.shape
X_test_preprocess = X_test_preprocess.reshape((n2samples,n2x*n2y))
print(X_train_preprocess.shape)
pca.fit(X_train_preprocess)
print(pca.n_components_)
scaler = StandardScaler()
scaler.fit(X_train_preprocess)
X_t_train = scaler.transform(X_train_preprocess)
X_t_test = scaler.transform(X_test_preprocess)
X_t_train = pca.transform(X_t_train)
X_t_test = pca.transform(X_t_test)
from sklearn.linear_model import LogisticRegression
logisticRegr = LogisticRegression(solver = 'lbfgs', max_iter = 5000)
logisticRegr.fit(X_t_train, y_train)
print('score', logisticRegr.predict(X_t_test[0:10]))
print('score', logisticRegr.score(X_t_test, y_test))
The output is:
(34799, 1024)
62
/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/logistic.py:469: FutureWarning: Default multi_class will be changed to 'auto' in 0.22. Specify the multi_class option to silence this warning.
"this warning.", FutureWarning)
score [ 1 2 10 10 13 10 25 1 1 4]
score 0.028820269200316707
So I'd like to know whether you can tell me what I'm doing wrong, and how I can get this to work properly.
In image recognition you are working with 2D data, so it is better to use a CNN to capture the high-dimensional relationships.
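Separately from the choice of model, one detail in the posted code stands out: `pca.fit` is called on the raw pixel rows (`X_train_preprocess`), while `pca.transform` is later applied to the output of `StandardScaler`. PCA is therefore centering and projecting data on a different scale from the data it was fitted on, which could plausibly contribute to the low score. Below is a minimal sketch of one way to keep the steps consistent, assuming scikit-learn's `make_pipeline`. The helper names `flatten_images`, `build_model` and `run_demo` are hypothetical, and the data is synthetic noise around random prototypes, not the traffic sign images.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def flatten_images(images):
    """Flatten (n, h, w) or (n, h, w, 1) images into (n, h*w) float rows."""
    images = np.asarray(images, dtype=np.float64)
    return images.reshape(len(images), -1)


def build_model(variance=0.95, max_iter=5000):
    """Chain scaler -> PCA -> logistic regression.

    Fitting the pipeline fits PCA on the *scaled* training data, so the
    data seen by PCA during fit and transform always match.
    """
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=variance),
        LogisticRegression(solver="lbfgs", max_iter=max_iter),
    )


def run_demo(seed=0):
    """Train and score the pipeline on synthetic 8x8 'images' (an assumption)."""
    rng = np.random.default_rng(seed)
    n_classes, n_per_class, side = 3, 100, 8
    # One random prototype image per class, plus Gaussian pixel noise.
    prototypes = rng.uniform(0, 255, size=(n_classes, side, side))
    labels = np.repeat(np.arange(n_classes), n_per_class)
    images = prototypes[labels] + rng.normal(0, 20, size=(len(labels), side, side))

    # Shuffle, then hold out the last third as a test set.
    order = rng.permutation(len(labels))
    train, test = order[:200], order[200:]

    model = build_model()
    model.fit(flatten_images(images[train]), labels[train])
    return model.score(flatten_images(images[test]), labels[test])


print("held-out accuracy:", run_demo())
```

With the arrays from the question, the same model would be fitted as `build_model().fit(flatten_images(preprocess2(X_train)), y_train)` and scored on `flatten_images(preprocess2(X_test))`. Whether this reaches an acceptable accuracy on the real data is not something this sketch shows.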