
Impossible ValueError. Value not in list while using numpy.unique

My code throws a 'Y[i] is not in list' ValueError, even though this list is composed of the unique values in Y. I printed the list, the types of the list and the type of Y[i] but found no solution. Also, the error occurs irregularly.

To provide some context: I am trying to write a simple piece of code that checks whether my K-Means classifier classified correctly. Because the cluster means are unlabeled ints, I want my output to be a matrix of integers such that C[h][y] is the number of times my model classifies X[i] as h while the actual label was y. Because the given labels are not necessarily integers, I assign them integers by creating a list of possible labels (V) and using the index into this list rather than the label itself.

The code (including debug prints):

    def classify(func, D):
        X = D[0]
        Y = D[1]
        V = list(np.unique(Y))    # <- V contains all values of Y
        print(V)
        print(type(V[0]),type(V[1]),type(V[2]))
        C = [V]
        for i in range(len(Y)):
            h = func(X[i])
            while len(C) < h+1:
                C.append(np.zeros(len(V)))
            if not Y[i] in V:
                print(type(Y[i]))
            y = V.index(Y[i])     # <- V does not contain Y[i]?
            C[h][y] += 1
        return np.array(C)

The output:

    [1.0, 2.0, 3.0]
    <class 'numpy.float64'> <class 'numpy.float64'> <class 'numpy.float64'>
    <class 'numpy.float64'>
    Traceback (most recent call last):
      File "leren6.py", line 38, in <module>
        main()
      File "leren6.py", line 18, in main
        C = classify(model, Data)
      File "leren6.py", line 33, in classify
        y = V.index(Y[i])
    ValueError: 3.0 is not in list

If you can fix this, you're officially awesome.

There isn't much information given (example function arguments which reproduce the bug would be helpful next time), but I suspect that this line is responsible:

    C = [V]

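The issue is that C[0] becomes another name for V: it is the same list object, not a copy. Hence, whenever the line C[h][y] += 1 is executed with h = 0, one item in V gets overwritten. So while V may have started as [np.float64(1.0), np.float64(2.0), np.float64(3.0)], it does not stay that way as its values are incremented through the loop, and a later V.index(Y[i]) can fail. This would also explain why the error occurs irregularly: it only shows up once func has returned 0 for a sample whose label is looked up again afterwards.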
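A minimal sketch of the aliasing, followed by one possible fix. The sample labels, the name classify_fixed, and the choice to return V alongside the counts are illustrative assumptions, not taken from the question; the sketch also assumes func returns a non-negative int cluster index.

```python
import numpy as np

# --- Reproducing the problem on made-up labels ---
V = list(np.unique(np.array([1.0, 2.0, 3.0, 3.0])))
C = [V]                  # C[0] is the same list object as V, not a copy
aliased = C[0] is V      # True: both names refer to one list

C[0][2] += 1             # meant to bump a counter, but V[2] is now 4.0
try:
    V.index(3.0)
    lookup_failed = False
except ValueError:       # "3.0 is not in list"
    lookup_failed = True


# --- One possible fix: keep the label list out of the count matrix ---
def classify_fixed(func, D):
    """Hypothetical rewrite: count how often cluster h coincides with label y."""
    X, Y = D
    V = list(np.unique(Y))          # label lookup table, never modified
    C = []                          # rows of counts only, one per cluster index
    for x, label in zip(X, Y):
        h = func(x)
        while len(C) < h + 1:       # grow the matrix up to cluster index h
            C.append(np.zeros(len(V)))
        C[h][V.index(label)] += 1
    return V, np.array(C)
```

If the first row really was meant to carry the labels as a header, copying them with C = [list(V)] would also avoid the aliasing, although counts for cluster 0 would then still be added onto that header row.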

It may be a rounding problem, since you're working with non-integer values. Try replacing 1.0, 2.0, ... with 1, 2, ... and see what happens.
