
Python: equality for NaN in a list?

I just want to figure out the logic behind these results:

>>> nan = float('nan')
>>> nan == nan
False
# I understand that this is because the __eq__ method is defined this way
>>> nan in [nan]
True
# Is this because the __contains__ method for list is defined to compare the identity first, then the content?

But in both cases I think the function PyObject_RichCompareBool is called behind the scenes, right? Why is there a difference? Shouldn't they have the same behaviour?


== never calls PyObject_RichCompareBool on the float objects directly. Floats have their own rich comparison function, float_richcompare (called for __eq__), which may or may not call PyObject_RichCompareBool depending on the arguments passed to it.

 /* Comparison is pretty much a nightmare.  When comparing float to float,
 * we do it as straightforwardly (and long-windedly) as conceivable, so
 * that, e.g., Python x == y delivers the same result as the platform
 * C x == y when x and/or y is a NaN.
 * When mixing float with an integer type, there's no good *uniform* approach.
 * Converting the double to an integer obviously doesn't work, since we
 * may lose info from fractional bits.  Converting the integer to a double
 * also has two failure modes:  (1) a long int may trigger overflow (too
 * large to fit in the dynamic range of a C double); (2) even a C long may have
 * more bits than fit in a C double (e.g., on a 64-bit box long may have
 * 63 bits of precision, but a C double probably has only 53), and then
 * we can falsely claim equality when low-order integer bits are lost by
 * coercion to double.  So this part is painful too.
 */

static PyObject*
float_richcompare(PyObject *v, PyObject *w, int op)
{
    double i, j;
    int r = 0;

    assert(PyFloat_Check(v));
    i = PyFloat_AS_DOUBLE(v);

    /* Switch on the type of w.  Set i and j to doubles to be compared,
     * and op to the richcomp to use.
     */
    if (PyFloat_Check(w))
        j = PyFloat_AS_DOUBLE(w);

    else if (!Py_IS_FINITE(i)) {
        if (PyInt_Check(w) || PyLong_Check(w))
            /* If i is an infinity, its magnitude exceeds any
             * finite integer, so it doesn't matter which int we
             * compare i with.  If i is a NaN, similarly.
             */
            j = 0.0;
        else
            goto Unimplemented;
    }
...

list_contains, on the other hand, calls PyObject_RichCompareBool directly on the items, and that function reports equality as soon as it sees that both operands are the same object. Hence you get True in the second case.
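
The Python language reference describes membership in the built-in sequences as equivalent to any(x is e or x == e for e in y). A minimal sketch of that rule follows; contains is a hypothetical helper name used only for illustration, not a CPython function:

```python
def contains(seq, x):
    """Approximate `x in seq` for built-in sequences in CPython."""
    for e in seq:
        # Identity is checked first; == is only consulted if it fails.
        if x is e or x == e:
            return True
    return False


nan = float('nan')
print(nan == nan)            # False: float equality follows IEEE 754
print(contains([nan], nan))  # True: the same object is found by identity
print(nan in [nan])          # True in CPython
```

Because the identity test comes first, the float's own __eq__ is never asked about the NaN, which is why the two expressions in the question disagree.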


Note that this is true only for CPython. PyPy's list.__contains__ method seems to compare the items only by calling their __eq__ method:

$~/pypy-2.4.0-linux64/bin# ./pypy
Python 2.7.8 (f5dcc2477b97, Sep 18 2014, 11:33:30)
[PyPy 2.4.0 with GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>> nan = float('nan')
>>>> nan == nan
False
>>>> nan is nan
True
>>>> nan in [nan]
False

You are right in saying that PyObject_RichCompareBool is called; see the list_contains function in listobject.c.

The docs say that:

This is the equivalent of the Python expression o1 op o2, where op is the operator corresponding to opid.

However, that does not appear to be entirely correct.

In the CPython source we have this part:

int
PyObject_RichCompareBool(PyObject *v, PyObject *w, int op)
{
    PyObject *res;
    int ok;

    /* Quick result when objects are the same.
       Guarantees that identity implies equality. */
    if (v == w) {
        if (op == Py_EQ)
            return 1;
        else if (op == Py_NE)
            return 0;
    }

In this case, since both operands are the same object, the function reports equality without ever calling __eq__.
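
The shortcut in the C excerpt above can be modelled in Python. This is only a rough sketch under stated assumptions: rich_compare_bool is a hypothetical name, the operator module's functions stand in for the Py_EQ and Py_NE opcodes, and error handling is left out.

```python
import operator


def rich_compare_bool(v, w, op):
    """Rough Python model of the identity shortcut in PyObject_RichCompareBool."""
    if v is w:
        if op is operator.eq:
            return True
        if op is operator.ne:
            return False
    # For every other case, fall back to the ordinary rich comparison.
    return bool(op(v, w))


nan = float('nan')
print(rich_compare_bool(nan, nan, operator.eq))  # True: identity shortcut
print(rich_compare_bool(nan, nan, operator.ne))  # False: identity shortcut
print(operator.eq(nan, nan))                     # False: plain nan == nan
print([nan] == [nan])                            # True in CPython
```

The last line shows the same shortcut at work when CPython compares two lists element by element.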

Mathematically, NaN stands for an undefined result, such as infinity minus infinity, so there is no meaningful value to compare. That's why IEEE 754 defines NaN as unequal to everything, including itself.

In the case of nan in [nan], the name nan and the list element refer to the same float object, so the identity check succeeds. But be careful:

>>> nan is nan
True

>>> float('nan') is float('nan')
False

In the first case the same float object is referenced twice. In the second, two different float objects are created and compared.
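
A short illustration of why this matters for membership tests. The list values is made up for the example, and math.isnan is one way (not the only one) to test for NaN by value instead of by identity:

```python
import math

nan = float('nan')
values = [nan, 1.0]

print(nan in values)           # True in CPython: the very same object is in the list
print(float('nan') in values)  # False: a new object, and NaN == NaN is False

# Testing the value rather than the identity works for any NaN object.
print(any(math.isnan(v) for v in values))  # True
```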


 