Why am I getting a data conversion warning?

I am relatively new to this area, so I would appreciate your help. I am playing around with the MNIST dataset. I took the code from http://g.sweyla.com/blog/2012/mnist-numpy/ but changed "images" to be 2-dimensional so that every image is a feature vector. I then ran PCA on the data, followed by an SVM, and checked the score. Everything seems to work fine, but I am getting the following warning and I am not sure why.

"DataConversionWarning: A column-vector y was passed when a 1d array was expected.\
Please change the shape of y to (n_samples, ), for example using ravel()."

I have tried several things but can't seem to get rid of this warning. Any suggestions? Here is the full code (the indentation may have been slightly garbled when I pasted it):

import os, struct
from array import array as pyarray
from numpy import append, array, int8, uint8, zeros, arange
from sklearn import svm, decomposition
#from pylab import *
#from matplotlib import pyplot as plt

def load_mnist(dataset="training", digits=arange(10), path="."):
    """
    Loads MNIST files into 3D numpy arrays

    Adapted from: http://abel.ee.ucla.edu/cvxopt/_downloads/mnist.py
    """

    if dataset == "training":
        fname_img = os.path.join(path, 'train-images.idx3-ubyte')
        fname_lbl = os.path.join(path, 'train-labels.idx1-ubyte')
    elif dataset == "testing":
        fname_img = os.path.join(path, 't10k-images.idx3-ubyte')
        fname_lbl = os.path.join(path, 't10k-labels.idx1-ubyte')
    else:
        raise ValueError("dataset must be 'testing' or 'training'")

    flbl = open(fname_lbl, 'rb')
    magic_nr, size = struct.unpack(">II", flbl.read(8))
    lbl = pyarray("b", flbl.read())
    flbl.close()

    fimg = open(fname_img, 'rb')
    magic_nr, size, rows, cols = struct.unpack(">IIII", fimg.read(16))
    img = pyarray("B", fimg.read())
    fimg.close()

    ind = [ k for k in range(size) if lbl[k] in digits ]
    N = len(ind)

    images = zeros((N, rows*cols), dtype=uint8)
    labels = zeros((N, 1), dtype=int8)
    for i in range(len(ind)):
        images[i] = array(img[ ind[i]*rows*cols : (ind[i]+1)*rows*cols ])
        labels[i] = lbl[ind[i]]

    return images, labels

if __name__ == "__main__":
    images, labels = load_mnist('training', arange(10),"path...")
    pca = decomposition.PCA()
    pca.fit(images)
    pca.n_components = 200
    images_reduced = pca.fit_transform(images)
    lin_classifier = svm.LinearSVC()
    lin_classifier.fit(images_reduced, labels)
    images2, labels2 = load_mnist('testing', arange(10),"path...")
    images2_reduced = pca.transform(images2)
    score = lin_classifier.score(images2_reduced,labels2)
    print(score)

Thanks for the help!

scikit-learn expects y to be a 1-D array, but your labels variable is 2-D: labels.shape is (N, 1). The warning tells you to use labels.ravel(), which turns labels into a 1-D array with shape (N,).
Reshaping will also work: labels = labels.reshape((N,)), where N is the number of samples.
Come to think of it, so will calling squeeze: labels = labels.squeeze()
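
The three options can be compared on a small array. This is only a minimal sketch: the five-element labels array is a hypothetical stand-in for the (N, 1) array returned by load_mnist, and reshape(-1) is used in place of reshape((N,)) so that N does not need to be known.

```python
import numpy as np

# Hypothetical stand-in for the (N, 1) labels array built in load_mnist.
labels = np.zeros((5, 1), dtype=np.int8)
labels[:, 0] = [3, 1, 4, 1, 5]

flat_ravel = labels.ravel()        # flatten to 1-D
flat_reshape = labels.reshape(-1)  # -1 lets numpy infer the length N
flat_squeeze = labels.squeeze()    # drop every axis of length 1

print(labels.shape)        # (5, 1)
print(flat_ravel.shape)    # (5,)
print(flat_reshape.shape)  # (5,)
print(flat_squeeze.shape)  # (5,)
```

Note that squeeze() removes all axes of length 1, so on an array with a single sample, shape (1, 1), it returns a 0-d array instead of shape (1,); ravel() and reshape(-1) do not have that edge case. Another option is to allocate labels as 1-D in the first place inside load_mnist, for example zeros(N, dtype=int8).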

The gotcha here is that in numpy, a 1-D array of shape (N,) is different from a 2-D array in which one dimension equals 1, such as shape (N, 1).
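
A short illustration of why the distinction matters, as a sketch with made-up three-element arrays (the names col and flat are illustrative only). Comparing a column vector against a 1-D array broadcasts to a full matrix, which is probably not what code expecting one label per sample wants.

```python
import numpy as np

col = np.arange(3).reshape(3, 1)  # 2-D column vector, shape (3, 1)
flat = np.arange(3)               # 1-D array, shape (3,)

print(col.ndim, flat.ndim)        # 2 1

# Element-wise comparison of two 1-D arrays stays 1-D.
same = flat == flat
print(same.shape)                 # (3,)

# Mixing (3, 1) with (3,) broadcasts to a 3x3 matrix.
mixed = col == flat
print(mixed.shape)                # (3, 3)
```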
