scikit-learn: reshape your data using X.reshape(-1, 1)

Question

I'm training a Python (2.7.11) classifier for text classification, and while it runs I get a deprecation warning, but I can't tell which line of my code is causing it. The warning is shown below. The code itself works fine and gives me results.

\\AppData\\Local\\Enthought\\Canopy\\User\\lib\\site-packages\\sklearn\\utils\\validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.

My code:

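import csv
import sys

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# accuracy() is a helper from the asker's own code; its definition is not shown.

def main():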
    data = []
    folds = 10
    ex = [ [] for x in range(0,10)]
    results = []
    for i,f in enumerate(sys.argv[1:]):
        data.append(csv.DictReader(open(f,'r'),delimiter='\t'))
    for f in data:       
        for i,datum in enumerate(f):
            ex[i % folds].append(datum)
    #print ex
    for held_out in range(0,folds):
        l = []
        cor = []
        l_test = []
        cor_test = []
        vec = []
        vec_test = []

        for i,fold in enumerate(ex):
            for line in fold:
                if i == held_out:
                    l_test.append(line['label'].rstrip("\n"))
                    cor_test.append(line['text'].rstrip("\n"))
                else:
                    l.append(line['label'].rstrip("\n"))
                    cor.append(line['text'].rstrip("\n"))

        vectorizer = CountVectorizer(ngram_range=(1,1),min_df=1)
        X = vectorizer.fit_transform(cor)
        for c in cor:        
            tmp = vectorizer.transform([c]).toarray()
            vec.append(tmp[0])
        for c in cor_test:        
            tmp = vectorizer.transform([c]).toarray()
            vec_test.append(tmp[0])

        clf = MultinomialNB()
        clf.fit(vec, l)
        result = accuracy(l_test,vec_test,clf)
        print result

if __name__ == "__main__":
    main()

Any idea which line raises this warning? Another issue: running this code on different data sets gives me exactly the same accuracy, and I can't figure out why. Also, I want to use this model in another Python process. I looked at the documentation and found an example using the pickle library, but none for joblib. So I tried following the same code, but it gave me errors:

clf = joblib.load('model.pkl') 
pred = clf.predict(vec);

Also, if my data is a CSV file in the format "label \t text \n", what should go in the label column of the test data?

Thanks in advance.

Answer 1

The 'vec' you pass to clf.fit(vec, l) needs to be a 2D structure of the form [[]], not just a flat []. This is a quirk I always forget when I fit models.

Just adding an extra set of square brackets should do the trick!
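To make the [] versus [[]] distinction concrete, here is a minimal sketch using NumPy. The sample values are made up for illustration; the point is only how the shape changes with each reshape call mentioned in the warning.

```python
import numpy as np

# One sample with four features, stored as a flat (1D) array.
sample = np.array([2, 70, 90, 1])

# reshape(1, -1): one row (sample), as many columns (features) as needed.
as_single_sample = sample.reshape(1, -1)

# reshape(-1, 1): as many rows (samples) as needed, one column (feature).
as_single_feature = sample.reshape(-1, 1)

# Wrapping the values in an extra pair of brackets also gives one row.
wrapped = np.array([[2, 70, 90, 1]])

print(sample.shape)             # (4,)
print(as_single_sample.shape)   # (1, 4)
print(as_single_feature.shape)  # (4, 1)
print(wrapped.shape)            # (1, 4)
```

Which reshape is right depends on what the flat array represents: a single sample to classify calls for reshape(1, -1), while a column of values for one feature calls for reshape(-1, 1).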

Answer 2

It's this line:

pred = clf.predict(vec);

I used this in my code and it worked:

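import numpy as np

# A single instance with four features; model is an already-fitted estimator
temp = [2, 70, 90, 1]
# reshape(1, -1) turns it into a 2D array with one row
temp = np.array(temp).reshape((1, -1))
print(model.predict(temp))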

Answer 3

Two solutions, both with the same idea: turn your data from 1D into 2D.

Just add []:
```
vec = [vec]
```

Reshape your data:

import numpy as np
vec = np.array(vec).reshape(1, -1)

Answer 4

If you want to find out where the warning is coming from, you can temporarily promote warnings to exceptions. This will give you a full traceback, and thus the lines where your program encountered the warning.

import warnings

with warnings.catch_warnings():
    warnings.simplefilter("error")  # raise warnings as exceptions
    main()

If you run the program from the command line, you can also use the -W flag. More information on warning handling can be found in the Python documentation.
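As a self-contained illustration of the same technique, the sketch below uses a hypothetical function, noisy_step(), as a stand-in for the library call that emits the deprecation warning. With the "error" filter active, the warning is raised as an exception and the traceback shows the call chain leading to it.

```python
import traceback
import warnings


def noisy_step():
    # Hypothetical stand-in for a library call that emits a deprecation warning.
    warnings.warn("Passing 1d arrays as data is deprecated", DeprecationWarning)
    return "done"


def run():
    return noisy_step()


caught = None
with warnings.catch_warnings():
    # Inside this block, every warning is raised as an exception instead.
    warnings.simplefilter("error")
    try:
        run()
    except DeprecationWarning as exc:
        caught = exc
        # The traceback lists each call on the way to the warning.
        traceback.print_exc()
```

From the command line, running `python -W error your_script.py` (where your_script.py is a placeholder for your own file) has the same effect for the whole run.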

I know I've only answered one part of your question, but did you debug your code?

Answer 5

Since passing a 1D array is deprecated, try passing a 2D array as the argument. This might help.

clf = joblib.load('model.pkl') 
pred = clf.predict([vec]);
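For the joblib part of the question, a minimal round-trip sketch might look like the following. It assumes a recent scikit-learn where joblib is imported as a standalone package; the tiny corpus, the labels, and the temporary file location are made up for illustration. The vectorizer is saved together with the classifier because new text has to be encoded with the same vocabulary the model was trained on.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up corpus, only for illustration.
texts = ["good great fine", "bad awful poor", "great good", "poor bad"]
labels = ["pos", "neg", "pos", "neg"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)  # 2D: (n_samples, n_features)
clf = MultinomialNB().fit(X, labels)

# Save the vectorizer together with the classifier.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
joblib.dump((vectorizer, clf), path)

# In the other process: load both, then transform a *list* of documents.
loaded_vectorizer, loaded_clf = joblib.load(path)
X_new = loaded_vectorizer.transform(["good fine"])  # one row, so already 2D
pred = loaded_clf.predict(X_new)
print(pred)
```

Because transform() is given a list of documents, its result already has one row per document, so no extra brackets or reshape are needed before predict().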

Answer 6

The predict method expects a 2D array. You can watch this video; I've also linked the exact timestamp: https://youtu.be/KjJ7WzEL-es?t=2602. You have to change [] to [[]].

6 answers

Answer 1: 19, 2016-06-29 00:16:33
Answer 2: 14, 2016-04-01 06:27:15
Answer 3: 6, 2017-07-23 18:23:25
Answer 4: 5, 2016-02-03 01:43:21
Answer 5: 0, 2017-04-30 10:19:25
Answer 6: 0, 2019-09-21 22:46:23