
ValueError: unknown url type: 0.0 error in sklearn

I have a simple script that tries to convert a CSV data file into a form that the tool svm_light can accept. Here's the code:

import csv
import sys
import numpy as np
from sklearn.cross_validation import train_test_split

def svm_light_conversion(row):
    conv_row = row[len(row) - 1] + ' '

    for i in xrange(len(row) - 1):
        conv_row = conv_row + str(i + 1) + ':' + str(row[i]) + ' '

    return conv_row

def reaData(inputfile):

    with open(inputfile, 'r') as inFile: 
        reader = csv.reader(inFile)
        my_content = list(reader)

    my_content = my_content[0:len(my_content) - 1]

    return my_content

def converToSVMLiteFormat(outputfile, train, test):

    train_file = outputfile + '_train.dat'
    test_file = outputfile + '_test.dat'
    #svm_light conversion for training data
    with open(train_file, 'wb') as txtfile:
        for i in xrange(len(train)):
            converted_row = svm_light_conversion(train[i]) + '\n'

            txtfile.write(converted_row)

    txtfile.close()

    #svm_light conversion for test data#
    with open(test_file, 'wb') as txtfile:
        for i in xrange(len(test)):
            converted_row = svm_light_conversion(test[i]) + '\n'

            txtfile.write(converted_row)

    txtfile.close()



def main():

    inputfile = sys.argv[1]
    outputfile = sys.argv[2]

    content = reaData(inputfile)

    train, test = train_test_split(content, train_size = 0.8) #split data
    converToSVMLiteFormat(outputfile, train, test)



if __name__ == "__main__":
    main()

It was working absolutely fine before, but now it is suddenly giving this error:

(env)fieldsofgold@fieldsofgold-VirtualBox:~/new$ python prac.py data.csv outt
Traceback (most recent call last):
  File "prac.py", line 4, in <module>
    from sklearn.cross_validation import train_test_split
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/cross_validation.py", line 32, in <module>
    from .metrics.scorer import check_scoring
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/metrics/__init__.py", line 7, in <module>
    from .ranking import auc
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/metrics/ranking.py", line 30, in <module>
    from ..utils.stats import rankdata
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/sklearn/utils/stats.py", line 2, in <module>
    from scipy.stats import rankdata as _sp_rankdata
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/__init__.py", line 338, in <module>
    from .stats import *
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/stats.py", line 189, in <module>
    from . import distributions
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/distributions.py", line 10, in <module>
    from ._distn_infrastructure import (entropy, rv_discrete, rv_continuous,
  File "/home/fieldsofgold/new/env/local/lib/python2.7/site-packages/scipy/stats/_distn_infrastructure.py", line 44, in <module>
    from new import instancemethod
  File "/home/fieldsofgold/new/new.py", line 10, in <module>
    response2 = urllib2.urlopen(row[12])
  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 396, in open
    protocol = req.get_type()
  File "/usr/lib/python2.7/urllib2.py", line 258, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: 0.0

Could anyone please help me parse the error? It seems like the error occurs somewhere in sklearn, but I do not completely understand what could be going wrong. Thanks.

If you follow the traceback, the line in your file

from sklearn.cross_validation import train_test_split

sets off a cascade of imports. But if you read further down the traceback, you'll see this:

    from new import instancemethod
  File "/home/fieldsofgold/new/new.py", line 10, in <module>

Python 2's standard library contains a deprecated module called new (the file new.py). However, you have also created a module called new.py in your own directory. Because of the import search order, Python first looks in the directory containing the script being run (here also your current working directory). Only if it doesn't find the module there does it try the other locations, in the order given by

>>> import sys
>>> sys.path
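To check which file a module was actually loaded from, you can inspect its `__file__` attribute after importing it. The helper below is only a sketch; `module_origin` is a name made up for this example, and `csv` stands in for whichever module you suspect is being shadowed.

```python
import csv
import os
import sys


def module_origin(module):
    """Return the absolute path a module was loaded from.

    Built-in modules such as sys have no __file__, so None is returned.
    (module_origin is a hypothetical helper name, not a library function.)
    """
    path = getattr(module, '__file__', None)
    return os.path.abspath(path) if path else None


# If this prints a path inside your own project instead of the
# standard library, a local file is shadowing the module.
print(module_origin(csv))
print(module_origin(sys))
```

In the situation from the question, running the same check on `new` would point at /home/fieldsofgold/new/new.py rather than at the standard library.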

So Python imports the wrong new.py, and it all snowballs from there: your new.py is executed at import time, and its line 10 calls urllib2.urlopen(row[12]) with the value 0.0, which is not a URL. To avoid the problem, simply rename your new folder and the new.py file to something else. Also, make sure you delete the new.pyc file that has been created, because its presence alone is enough for Python to attempt the import from there.
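To spot this kind of clash before it bites, one option is to compare the .py files in a directory against the module names importable from the rest of sys.path. This is a rough sketch rather than a complete tool: `find_shadowing_files` is a hypothetical helper, it relies on `pkgutil.iter_modules`, and it does not see built-in modules that have no file on disk.

```python
import os
import pkgutil
import sys


def find_shadowing_files(directory, search_path=None):
    """List .py files in `directory` whose names match modules that are
    importable from the other entries of `search_path` (default: sys.path).

    Hypothetical helper for illustration; built-in modules without a file
    on disk are not detected.
    """
    if search_path is None:
        search_path = sys.path
    directory = os.path.abspath(directory)
    # Every search-path entry except the directory being checked.
    other_paths = [p for p in search_path
                   if p and os.path.abspath(p) != directory]
    available = set(name for _, name, _ in pkgutil.iter_modules(other_paths))
    local = set(os.path.splitext(f)[0] for f in os.listdir(directory)
                if f.endswith('.py'))
    return sorted(local & available)


if __name__ == '__main__':
    # Report clashes for the current working directory.
    print(find_shadowing_files(os.getcwd()))
```

Any name it reports is a candidate for renaming. Run under Python 2.7 in the directory from the question, a check like this would be expected to flag new.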

For the curious, this is the content of the standard library's new.py, located in .../Python27/Lib/ on Windows.

"""Create new objects of various types.  Deprecated.
This module is no longer required except for backward compatibility.
Objects of most types can now be created by calling the type object.
"""
from warnings import warnpy3k
warnpy3k("The 'new' module has been removed in Python 3.0; use the 'types' "
            "module instead.", stacklevel=2)
del warnpy3k

from types import ClassType as classobj
from types import FunctionType as function
from types import InstanceType as instance
from types import MethodType as instancemethod
from types import ModuleType as module

from types import CodeType as code
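As the listing shows, new.instancemethod is just an alias for types.MethodType, so code that still imports new can usually call the types module directly. A minimal sketch, where the `Greeter` class and `greet` function are invented purely for illustration:

```python
import types


class Greeter(object):
    def __init__(self, name):
        self.name = name


def greet(self):
    # Plain function that expects an instance as its first argument.
    return 'hello, ' + self.name


g = Greeter('svm')
# Bind the function to this one instance. types.MethodType is the object
# that the old new.instancemethod name pointed to.
g.greet = types.MethodType(greet, g)
print(g.greet())
```

Calling the type object directly like this avoids importing new at all, which is what the module's own docstring recommends.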
