
Removing an item from a Python list: how are items compared (e.g. NumPy arrays)?

I am a bit puzzled by the Python (2.7) list.remove function. The documentation of remove says: "Remove the first item from the list whose value is x. It is an error if there is no such item."

So I guess that "value" here means the comparison is based on equality (i.e. ==) and not identity (i.e. is). However, can someone explain the following behaviour to me? Apparently both comparisons are used, but in a rather strange way:

import numpy as np

x = np.array([1,2,3])
mylist = [x, 42, 'test', x] # list containing the numpy array twice
print mylist

This will, of course, print:

[array([1, 2, 3]), 42, 'test', array([1, 2, 3])]

So far so good. But strangely the following code does execute:

mylist.remove(x)
print mylist

giving

[42, 'test', array([1, 2, 3])]

I would expect it to throw an error, because comparing numpy arrays does not return a single boolean value but a boolean array. For instance, x == x returns array([ True, True, True], dtype=bool). Yet the removal happily executes. However, calling the same statement once more yields the predicted behaviour:

mylist.remove(x)

throws a

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-835a19b5f6a9> in <module>()
----> 1 mylist.remove(x)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

What is going on?

Looking at the source code, list.remove uses the PyObject_RichCompareBool function to compare objects. This function contains the following at the beginning:

/* Quick result when objects are the same.
Guarantees that identity implies equality. */
if (v == w) {
    if (op == Py_EQ)
        return 1;
    else if (op == Py_NE)
        return 0;
}

So it compares object identity first. Only if the two objects are not the same object does it proceed to use the == operator.

In your example, if x is the first object in the list, it will be the same object as the removed value and therefore be regarded as equal by the above function and removed. If anything else is the first object, it will be compared to x with the == operator, which will return a numpy array and cause an error as it can't be converted to a boolean value.
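
To make the order of checks concrete, here is a minimal pure-Python sketch of that behaviour. It is an approximation for illustration, not the actual CPython implementation. The names remove_sketch, FakeArray and ElementwiseResult are made up for this example; FakeArray is only a stand-in for a multi-element numpy array so that the snippet runs without numpy installed.

```python
class ElementwiseResult:
    """Stand-in for the boolean array that `array == other` returns."""

    def __bool__(self):
        # Mimics numpy refusing to collapse several booleans into one.
        raise ValueError("truth value is ambiguous")


class FakeArray:
    """Hypothetical stand-in for a multi-element numpy array."""

    def __eq__(self, other):
        return ElementwiseResult()


def remove_sketch(items, value):
    """Pure-Python approximation of list.remove (assumption, not the real code)."""
    for index, item in enumerate(items):
        # Identity shortcut first; == is only tried for distinct objects.
        if item is value or item == value:
            del items[index]
            return
    raise ValueError("value not in list")


x = FakeArray()
mylist = [x, 42, "test", x]

remove_sketch(mylist, x)      # first element is x itself: == is never called
first_call_result = list(mylist)

try:
    remove_sketch(mylist, x)  # first element is now 42: 42 == x is evaluated
    second_call_error = None
except ValueError as exc:
    second_call_error = str(exc)
```

The first call succeeds on the identity check alone. The second call reaches the equality test for 42, and the result of that test cannot be turned into a single True or False.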

The in operator works in the same way, so x in [x,1] returns True while x in [1,x] raises an error.
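
The same thing can be tried out for the in operator. For lists, the Python documentation describes x in y as equivalent to any(x is e or x == e for e in y). The sketch below again uses a made-up FakeArray class as a stand-in for a numpy array (an assumption for illustration, so that no numpy install is needed).

```python
class ElementwiseResult:
    """Stand-in for the boolean array produced by `array == other`."""

    def __bool__(self):
        raise ValueError("truth value is ambiguous")


class FakeArray:
    """Hypothetical stand-in for a multi-element numpy array."""

    def __eq__(self, other):
        return ElementwiseResult()


x = FakeArray()

# x is the first element, so the identity shortcut answers immediately.
found_when_first = x in [x, 1]

# 1 is checked first; it is not x, so == runs and its result cannot
# be reduced to a single True/False.
try:
    found_when_second = x in [1, x]
except ValueError as exc:
    found_when_second = "ValueError: " + str(exc)
```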

The error occurs the second time because testing 42 for identity with x fails and Python falls back to comparing x with the integer 42 using equality ( == ).

mylist.remove(x) gets rid of the first occurrence of x in the list without any hiccups because x is x returns True. The problem is that when the first element of the list is 42, x is 42 returns False, so Python tries x == 42 instead.

This equality test returns the array array([False, False, False]). Unlike most built-in Python objects, a NumPy array with more than one element has an ambiguous truth value, so converting this result to a boolean raises the ValueError.
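
If what you actually want is to remove one specific object regardless of what == returns, one possible approach is to compare by identity only. This is a sketch, not something list provides out of the box; remove_by_identity is a hypothetical helper name, and plain lists are used in the demonstration so it runs without numpy.

```python
def remove_by_identity(items, target):
    """Delete the first element that is the very same object as `target`."""
    for index, item in enumerate(items):
        if item is target:  # identity only; == is never evaluated
            del items[index]
            return
    raise ValueError("object not in list")


a = [1, 2, 3]
b = [1, 2, 3]          # equal to a, but a different object
items = [b, 42, a]

# list.remove(a) would delete b here, because b == a is True.
# The identity-based helper skips b and 42 and deletes a itself.
remove_by_identity(items, a)
```

With the list from the question, such a helper would never evaluate 42 == x, so the ambiguous truth value would not come up.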
