
How can I tell whether a package/module is part of Python's standard library, without using a third-party library?

I want to review/limit dependencies in an application and I want to ignore everything that's already included in the standard lib for that given Python release. What is a simple Pythonic way to do this?

I have done some work already, but since this seems like a rather basic concern, I wonder if there are better ways. My current solution also looks dangerously platform-dependent (I am on macOS; it might work on Linux, but I doubt it will on Windows).

Basically, I look at the directory where a known stdlib module is located and then collect the Python files and directories found in it.

What I have so far:

import os
import sys
from pathlib import Path

# parent directory of the `os` module
pa_lib_builtins = Path(os.__file__).parent
s_stdlib = set()

for pa_py in pa_lib_builtins.glob("*"):

    #skip some directories
    if pa_py.name in {"site-packages", "lib-dynload"}:
        continue

    #add modules/packages
    if pa_py.suffix == ".py":
        #a direct module, add it.
        s_stdlib.add(pa_py.stem)
    else:
        #it's a package, add it
        if pa_py.is_dir():
            s_stdlib.add(pa_py.name)

# this looks too operating-system dependent: add the extension modules in lib-dynload
pa_lib_dynload = pa_lib_builtins / "lib-dynload"
for pa in pa_lib_dynload.glob("*"):
    # keep only the part before the first "."
    s_stdlib.add(pa.stem.split(".")[0])

#add stuff that I can't find on the file system but is registered as built-in
#this brings in `sys`, for example.
for name in sys.builtin_module_names:
    s_stdlib.add(name)

Contents of lib-dynload on macOS (I only keep the part to the left of the first "."):

.
..
_asyncio.cpython-36m-darwin.so
_bisect.cpython-36m-darwin.so
_blake2.cpython-36m-darwin.so
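
A minimal sketch of the "keep what is left of the first dot" step, applied to file names like the ones listed above. The helper name module_name_from_filename is hypothetical and not part of the original code. The split is needed because Path.stem only drops the last suffix, so it would leave "_asyncio.cpython-36m-darwin".

```python
from pathlib import PurePath


def module_name_from_filename(filename: str) -> str:
    """Return the text before the first dot of the file's base name."""
    # PurePath(...).name drops any leading directories; the split then
    # removes every suffix, e.g. ".cpython-36m-darwin.so" or ".py".
    return PurePath(filename).name.split(".")[0]


for fname in (
    "_asyncio.cpython-36m-darwin.so",
    "_bisect.cpython-36m-darwin.so",
    "os.py",
):
    print(fname, "->", module_name_from_filename(fname))
```

This treats any file in the directory as a module, which matches the "false positives are acceptable" stance taken here.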

Is there a better way? Note that I am not overly concerned about false positives. For example, treating __pycache__ as a module doesn't really hurt, since I won't have any import __pycache__ to worry about.

This mostly does the trick by introspecting the file system at run-time. Unlike a documentation parse, it is sensitive to what's actually installed, on the current OS. And it works without internet access as well.
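
As a possible alternative to deriving the directory from a known module's __file__, the standard sysconfig module reports where the interpreter keeps its standard library. The sketch below is an assumption-laden illustration and not part of the original answer: stdlib_dirs is a hypothetical helper, and on many installs the two entries point at the same directory.

```python
import sysconfig
from pathlib import Path


def stdlib_dirs():
    """Return the stdlib directories reported by sysconfig, without duplicates."""
    paths = sysconfig.get_paths()
    dirs = []
    # "stdlib" holds the pure-Python modules; "platstdlib" holds the
    # platform-specific part and is frequently the same path.
    for key in ("stdlib", "platstdlib"):
        candidate = Path(paths[key])
        if candidate not in dirs:
            dirs.append(candidate)
    return dirs


for directory in stdlib_dirs():
    print(directory, "(exists)" if directory.is_dir() else "(missing)")
```

Globbing these directories would still be a file-system introspection, so it stays sensitive to what is actually installed and needs no network access.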

Comparing the result with https://github.com/jackmaney/python-stdlib-list (for 3.6, I think):

  • no msilib or msvcrt on my Mac.
  • picked up posixpath.
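
A comparison like this boils down to two set differences. The sketch below uses tiny made-up sets that mirror the two differences just noted; compare_name_sets is a hypothetical helper. As an aside that the original does not mention, Python 3.10 and later expose sys.stdlib_module_names, which could serve as another reference list; the code checks for it with getattr because it does not exist on 3.6.

```python
import sys


def compare_name_sets(found, reference):
    """Return the names unique to each side, sorted for stable output."""
    found, reference = set(found), set(reference)
    return {
        "only_in_found": sorted(found - reference),
        "only_in_reference": sorted(reference - found),
    }


# Made-up miniature sets, for illustration only.
demo = compare_name_sets({"os", "sys", "posixpath"}, {"os", "sys", "msilib"})
print(demo)
# {'only_in_found': ['posixpath'], 'only_in_reference': ['msilib']}

# Optional reference list, available on Python 3.10+ only (None otherwise).
builtin_reference = getattr(sys, "stdlib_module_names", None)
if builtin_reference is not None:
    diff = compare_name_sets(sys.builtin_module_names, builtin_reference)
    print(len(diff["only_in_reference"]), "stdlib names are not compiled-in built-ins")
```

Note that sys.stdlib_module_names is the same on every platform, so unlike the file-system approach it also lists modules that are not available on the current OS.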

I will run this against Windows and Linux later and update it as necessary, but I expect it to work without too much hassle.


def get_stdlib_names(try_import=False):
    """ 
    get stdlib module/package names 
    by globbing file system
    """

    import importlib
    import pathlib
    import sys
    Path = pathlib.Path

    pa_dyn_name = os_specific = pa_dyn = None
    s_stdlib0 = set()

    try:
        # Unix services
        import syslog as os_specific
    except ModuleNotFoundError:
        try:
            # Windows services
            import msilib as os_specific
        except ModuleNotFoundError:
            pass

    # parent directory for `pathlib`  itself
    pa_stdlib = Path(pathlib.__file__).parent

    # OS-dependent/dynamic libraries
    if os_specific:
        pa_dyn = Path(os_specific.__file__).parent
        pa_dyn_name = pa_dyn.name

    for pa_py in pa_stdlib.glob("*"):

        # skip directories we don't want
        if pa_py.name in {"site-packages", pa_dyn_name, "test"}:
            continue

        # add modules (*.py) and packages (directories)
        if pa_py.suffix == ".py":
            s_stdlib0.add(pa_py.stem)
        else:
            if pa_py.is_dir():
                s_stdlib0.add(pa_py.name)

    # load dynamic libraries
    if os_specific:
        # consider qualifying glob with `.dll`/`.so` extension
        for pa in pa_dyn.glob("*"):
            name = pa.stem.split(".")[0]
            s_stdlib0.add(name)

    # add stuff not on the file system, like `sys`
    for name in sys.builtin_module_names:
        s_stdlib0.add(name)

    # skip `_` aliases for modules:
    #  .../lib/python3.6/asyncio/__init__.py
    #  .../lib/python3.6/lib-dynload/_asyncio.cpython-36m-darwin.so
    s_stdlib0 = {
        name
        for name in s_stdlib0
        if (not name.startswith("_"))
        # special cases
        or name in ("__future__ _thread _dummy_thread".split())
    }

    s_stdlib0.add("__main__")

    # do we want to check if we can import it?
    if not try_import:
        return s_stdlib0

    s_stdlib = set()

    for name in s_stdlib0:

        # side effects/won't import
        if name in ("antigravity", "this", "__main__"):
            s_stdlib.add(name)
            continue
        try:
            importlib.import_module(name)
            s_stdlib.add(name)
        except ModuleNotFoundError:
            pass

    return s_stdlib
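
To connect this back to the goal of reviewing dependencies, here is a rough sketch of how a set of stdlib names might be used to filter the imports found in a source file. It is an illustration under assumptions, not part of the original answer: top_level_imports and non_stdlib_imports are hypothetical helpers, the stdlib set is passed in by the caller (for example the result of get_stdlib_names()), and only static import statements are seen, so imports done via importlib or __import__ are missed.

```python
import ast


def top_level_imports(source: str) -> set:
    """Collect the top-level package names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import, which is never a dependency
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


def non_stdlib_imports(source: str, stdlib_names) -> set:
    """Return imported top-level names that are not in stdlib_names."""
    return top_level_imports(source) - set(stdlib_names)


sample = """
import os.path
import requests
from collections import OrderedDict
from . import sibling
from yaml import safe_load
"""

# A tiny stand-in for the real stdlib set, for illustration only.
print(sorted(non_stdlib_imports(sample, {"os", "collections", "sys"})))
# ['requests', 'yaml']
```

Local packages of the application itself would also show up in the result, so a real review would still need to subtract those.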


 