简体   繁体   中英

How performant is Python's pattern matching? Is it O(1)?

As we know, Python will support pattern matching in 3.10. Putting usage and syntax aside, I can't seem to find any word about performance in the PEP, or the official tutorial .

So I'm wondering, how performant is it? Is it O(1) or the same as if..elif..else ?

Pattern matching desugars into chained elif statements and isinsance calls. It's definitely not O(1) or close to any kind of lookup table.

Edit: More precisely, in the merged pull request , you can see that the match statement is turned into a chain of opcodes responsible for handling the various patterns. These opcodes are processed in sequence, much like chained elif statements.

In terms of time complexity alone, the two constructs are basically equivalent.

They're very different, though, when considering readability and code structure. As a general rule, if you're matching values or testing several subjects, an if - elif - else ladder will probably be the best tool for the job:

def stringify_response(response: int) -> str:
    if 100 <= response < 200:
        kind = "info"
    elif 200 <= response < 300:
        kind = "success"
    elif 300 <= response < 400:
        kind = "redirect"
    elif 400 <= response < 500:
        kind = "client error"
    elif 500 <= response < 600:
        kind = "server error"
    else:
        kind = "unexpected response"
    return f"{kind} ({response})"

On the other hand, if you're matching the structure of a single subject, then structural pattern matching will probably be the best tool for the job:

from ast import *

def lint_function_def(function_def: FunctionDef) -> None:
    match function_def.body:
        case *_, Assign([Name(na)], e), Return(Name(nr, lineno=l)) if na == nr:
            print(f"line {l}: consider returning {unparse(e)} instead of "
                  f"assigning to {na!r} on line {e.lineno}")

Future Improvements

With that said, one of the benefits of syntactical support for pattern matching is that the compiler and interpreter possess an extraordinary knowledge of context. Because of this, they can make assumptions that ordinary control flow cannot.

Consider this statement (slightly simplified) from PEP 636 :

match command.split():
    case ["north"] | ["go", "north"]:
        # Code for moving north.
    case ["get", obj] | ["pick", "up", obj] | ["pick", obj, "up"]:
        # Code for picking up the given object.

The "current" (CPython 3.10) pattern compiler is quite naive, and compiles it to something like this:

_subject = command.split()
import _collections_abc
if (
    isinstance(_subject, _collections_abc.Sequence)
    and len(_subject) == 1
    and _subject[0] == "north"
):
    # Code for moving north.
elif (
    isinstance(_subject, _collections_abc.Sequence)
    and len(_subject) == 2
    and _subject[0] == "go"
    and _subject[1] == "north"
):
    # Code for moving north.
elif (
    isinstance(_subject, _collections_abc.Sequence)
    and len(_subject) == 2
    and _subject[0] == "get"
):
    obj = _subject[1]
    # Code for picking up the given object.
elif (
    isinstance(_subject, _collections_abc.Sequence)
    and len(_subject) == 3
    and _subject[0] == "pick"
    and _subject[1] == "up"
):
    obj = _subject[2]
    # Code for picking up the given object.
elif (
    isinstance(_subject, _collections_abc.Sequence)
    and len(_subject) == 3
    and _subject[0] == "pick"
    and _subject[2] == "up"
):
    obj = _subject[1]
    # Code for picking up the given object.

Even though this example is quite simple, the generated code is full of redundant checks. Matching the command "pick Spam up" checks isinstance(_subject, _collections_abc.Sequence) five times, calls len(_subject) five times, and checks _subject[0] == "pick" twice!

However, it's entirely possible for the compiler to generate more efficient "decision trees" instead, where no unnecessary work is performed. Here's what that might look like:

_subject = command.split()
import _collections_abc
if isinstance(_subject, _collections_abc.Sequence):
    _len_subject = len(_subject)
    if _len_subject == 1:
        if _subject[0] == "north":
            # Code for moving north.
    elif _len_subject == 2:
        _subject_0 = _subject[0]
        if _subject_0 == "go":
            if _subject[1] == "north":
                # Code for moving north.
        elif _subject_0 == "get":
            obj = _subject[1]
            # Code for picking up the given object.
    elif _len_subject == 3:
        if _subject[0] == "pick":
            _subject_1 = _subject[1]
            if _subject_1 == "up":
                obj = _subject[2]
                # Code for picking up the given object.
            elif _subject[2] == "up":
                obj = _subject_1
                # Code for picking up the given object.

It starts getting more complicated when you have guards or nested patterns... but I'm hopeful that we'll be able to get something like this into CPython 3.11!

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM