
Python: using int() on a string that is not an integer literal

Note: I was using the wrong source file for my data - once that was fixed, my issue was resolved. It turns out, there is no simple way to use int(..) on a string that is not an integer literal.

This is an example from the book "Machine Learning In Action", and I cannot quite figure out what is wrong. Here's some background:

from numpy import *

def file2matrix(filename):
    fr = open(filename)
    numberOfLines = len(fr.readlines())
    returnMat = zeros((numberOfLines,3))
    classLabelVector = []
    fr = open(filename)
    index = 0
    for line in fr.readlines():
        line = line.strip()
        listFromLine = line.split('\t')
        returnMat[index,:] = listFromLine[0:3]
        classLabelVector.append(int(listFromLine[-1])) # Problem here.
        index += 1
    return returnMat,classLabelVector

The .txt file is as follows:

40920   8.326976    0.953952    largeDoses
14488   7.153469    1.673904    smallDoses
26052   1.441871    0.805124    didntLike
75136   13.147394   0.428964    didntLike
38344   1.669788    0.134296    didntLike
...

I am getting an error on the line classLabelVector.append(int(listFromLine[-1])) because, I believe, int(..) is trying to parse a string (i.e. "largeDoses") that is not an integer literal. Am I missing something?

I looked up the documentation for int() , but it only seems to parse numbers and integer literals:

http://docs.python.org/2/library/functions.html#int
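To make the behaviour concrete, here is a minimal sketch of what int() accepts and rejects when given a string. The helper name try_int is hypothetical and only wraps the built-in so the failures do not stop the script:

```python
# Hypothetical helper for illustration; not part of the book's code.
def try_int(text):
    """Return int(text), or None if text is not an integer literal."""
    try:
        return int(text)
    except ValueError:
        return None

print(try_int("42"))          # an integer literal parses
print(try_int(" 7\n"))        # surrounding whitespace is tolerated
print(try_int("9.4"))         # a float literal is rejected
print(try_int("largeDoses"))  # arbitrary text is rejected
```

The last call is the case that fails in file2matrix: the final column of the data file holds a word, not digits.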

Also, an excerpt from the book explains this section as follows:

Finally, you loop over all the lines in the file and strip off the return line character with line.strip(). Next, you split the line into a list of elements delimited by the tab character: '\t'. You take the first three elements and shove them into a row of your matrix, and you use the Python feature of negative indexing to get the last item from the list to put into classLabelVector. You have to explicitly tell the interpreter that you'd like the integer version of the last item in the list, or it will give you the string version. Usually, you'd have to do this, but NumPy takes care of those details for you.

Strings like "largeDoses" cannot be converted to integers. The Ch02 folder of that code project contains two data files; load the second one, datingTestSet2.txt, instead of the first.
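If you would rather keep the file with text labels, one possible workaround is to translate each label through a dictionary instead of calling int() on it. This is only a sketch: the names LABEL_CODES and parse_line are hypothetical, and the numeric codes are arbitrary values chosen for illustration, not necessarily the ones the book uses.

```python
# Assumed mapping from text labels to integer codes (arbitrary illustration values).
LABEL_CODES = {"didntLike": 1, "smallDoses": 2, "largeDoses": 3}

def parse_line(line):
    """Split one tab-delimited record into three float features and an integer label."""
    fields = line.strip().split("\t")
    features = [float(value) for value in fields[0:3]]
    # A KeyError here means the file contains a label missing from LABEL_CODES.
    label = LABEL_CODES[fields[-1]]
    return features, label

features, label = parse_line("40920\t8.326976\t0.953952\tlargeDoses\n")
print(features, label)
```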

You can use ast.literal_eval and catch the ValueError raised for a malformed string (note that int('9.4') also raises an exception).
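A minimal sketch of that suggestion follows. The function name parse_label is hypothetical, and catching SyntaxError alongside ValueError is an added assumption, since some inputs (an empty string, for example) fail at the parsing stage rather than the evaluation stage.

```python
import ast

def parse_label(text):
    """Return text as an int if it is an integer literal, else None."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Raised for strings that are not valid Python literals, e.g. "largeDoses".
        return None
    # literal_eval also accepts floats, strings, lists, etc.; keep plain ints only.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    return None

print(parse_label("3"))
print(parse_label("9.4"))
print(parse_label("largeDoses"))
```

Unlike int(), ast.literal_eval happily parses "9.4" as a float, which is why the sketch checks the type of the result before returning it.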
