简体   繁体   English

内置“in”运算符的Python源代码

[英]Python source code for built-in “in” operator

I am trying to find the implementation of the built-in in operator in the (C) Python source code. 我想找到的执行内置in中(C)Python源代码运营商。 I have searched in the built-in functions source code, bltinmodule.c , but cannot find the implementation of this operator. 我在内置函数源代码bltinmodule.c中搜索过 ,但是找不到这个运算符的实现。 Where can I find this implementation? 我在哪里可以找到这个实现?

My goal is to improve the sub-string search in Python by extending different C implementations of this search, although I am not sure if Python already uses the idea I have. 我的目标是通过扩展此搜索的不同C实现来改进Python中的子字符串搜索,尽管我不确定Python是否已经使用了我的想法。

To find the implementation of any python operator, first find out what bytecode Python generates for it, using the dis.dis function : 要查找任何 python运算符的实现,首先使用dis.dis函数找出Python为其生成的字节码:

>>> dis.dis("'0' in ()")
  1           0 LOAD_CONST               0 ('0')
              2 LOAD_CONST               1 (())
              4 COMPARE_OP               6 (in)
              6 RETURN_VALUE

The in operator becomes a COMPARE_OP byte code. in运算符变为COMPARE_OP字节代码。 Now you can trace how this opcode is being handled in the Python evaluation loop in Python/ceval.c : 现在,您可以在Python/ceval.c的Python评估循环中跟踪如何处理此操作码:

TARGET(COMPARE_OP)
    PyObject *right = POP();
    PyObject *left = TOP();
    PyObject *res = cmp_outcome(oparg, left, right);
    Py_DECREF(left);
    Py_DECREF(right);
    SET_TOP(res);
    if (res == NULL)
        goto error;
    PREDICT(POP_JUMP_IF_FALSE);
    PREDICT(POP_JUMP_IF_TRUE);
    DISPATCH();

cmp_outcome() is defined in the same file , and the in operator is one of the switches: cmp_outcome() 在同一个文件中定义, in运算符是其中一个开关:

case PyCmp_IN:
    res = PySequence_Contains(w, v);
    if (res < 0)
         return NULL;
    break;

A quick grep shows us where PySequence_Contains is defined, in Objects/abstract.c : 快速grep向我们展示了在Objects / abstract.c中定义PySequence_Contains位置:

int
PySequence_Contains(PyObject *seq, PyObject *ob)
{
    Py_ssize_t result;
    PySequenceMethods *sqm = seq->ob_type->tp_as_sequence;
    if (sqm != NULL && sqm->sq_contains != NULL)
        return (*sqm->sq_contains)(seq, ob);
    result = _PySequence_IterSearch(seq, ob, PY_ITERSEARCH_CONTAINS);
    return Py_SAFE_DOWNCAST(result, Py_ssize_t, int);
}

PySequence_Contains thus uses the sq_contains slot on the Sequence object structure or an iterative search otherwise, for Python C objects. 因此,对于Python C对象, PySequence_Contains使用Sequence对象结构上sq_contains或否则使用迭代搜索。

For Python 3 Unicode string objects, this slot is implemented as PyUnicode_Contains in Objects/unicodeobject.c , in Python 2 you also want to check out string_contains in Objects/stringobject.c . 对于Python 3的Unicode字符串对象,该插槽被实现为PyUnicode_Contains的对象/ unicodeobject.c ,在Python 2,你也想看看string_contains在对象/ stringobject.c Basically just grep for sq_contains in the Objects/ subdirectory for the various implementations by the different Python types. 基本上只是grep for Objects /子目录中的sq_contains ,用于不同Python类型的各种实现。

For generic python objects, it's interesting to note that Objects/typeobject.c defers this to the __contains__ method on custom classes, if so defined. 对于通用python对象,有趣的是注意到Objects / typeobject.c将此推迟到自定义类的__contains__方法,如果这样定义的话。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM