Which is the better way to handle ImportError - Raise error or import chain?

Question

When I encounter an ImportError in Python, should I raise the error directly and ask the user to install the package, or should I use an import chain?

Description

I came across this question when I tried to use the lxml package to parse an XML file in Python.
Its official documentation says:

If your code only uses the ElementTree API and does not rely on any functionality that is specific to lxml.etree, you can also use (any part of) the following import chain as a fall-back to the original ElementTree:

try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        ...

It seems to me that importing a substitute is a bad idea: if the substitute library may not have all the methods that lxml has, then the whole script can rely only on the methods available in every package in the chain.

In that case it makes little sense to import the most powerful package (e.g. lxml here); we could import the least functional one directly and save a lot of code. Or, if we want to use the additional methods later, we should raise the ImportError directly.

However, as answered in "Error handling when importing modules", this approach seems to be used frequently in Python programming:

it's useful to define multi-level functionality based on what library has been imported in runtime.

But it seems to me that multi-level functionality can only be achieved by constantly checking which library has been imported, which makes the code complicated and ugly.

So I wonder why people sometimes use this structure instead of raising the error directly.

To answer your last question first:

When I encounter an ImportError in Python, should I raise the error directly and ask the user to install the package, or should I use an import chain?

How you handle an ImportError depends on the situation:

  • If your module directly depends on a module, let the error happen. Some libraries re-raise the error with a helpful error message if the dependency's installation is non-trivial.
  • If your module is trying to substitute slower libraries for faster ones with identical APIs, there's no reason to print anything to screen.
  • If your module expects a certain library to exist but a significantly slower one is the only one you can find, a warning may be useful to let the developer know that your module will still function but will not be as fast as it should.
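The three cases above can be sketched with two small helpers. This is only an illustration: `require` and `import_with_fallback` are hypothetical names, not part of any library, and the install hint text is whatever your project chooses.

```python
import importlib
import warnings


def require(module_name, install_hint):
    """Import a hard dependency, re-raising ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Keep the original exception chained so the real cause stays visible.
        raise ImportError(
            f"{module_name} is required but could not be imported. {install_hint}"
        ) from exc


def import_with_fallback(fast_name, slow_name, warn=False):
    """Prefer fast_name; fall back to slow_name, optionally warning about it."""
    try:
        return importlib.import_module(fast_name)
    except ImportError:
        if warn:
            # Only warn when the slowdown is significant enough to matter.
            warnings.warn(
                f"{fast_name} not found; falling back to slower {slow_name}",
                RuntimeWarning,
                stacklevel=2,
            )
        return importlib.import_module(slow_name)
```

With `warn=False` the fallback is silent, which matches the second case; with `warn=True` the developer is told about the slower path, which matches the third.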

Now for your other questions:

In that case it makes little sense to import the most powerful package (e.g. lxml here); we could import the least functional one directly and save a lot of code.

In the specific case of lxml.etree, ElementTree, and cElementTree, all three implement the same API. They're substitutes for one another. ElementTree is pure-Python and will always work, but cElementTree is usually present and is faster. lxml.etree is even faster but is an external module.

Think of it like this:

try:
    import super_fast_widget as widget
except ImportError:
    try:
        import fast_widget as widget
    except ImportError:
        import slow_widget as widget

From your code's perspective, widget will always work the same regardless of which library actually ended up getting imported, so it's best to try to import the fastest implementation and fall back on slower ones if performance is something you care about.

You are correct that you can't use all of lxml's features if you allow fallback libraries. This is why lxml.etree is being used instead of just lxml: it intentionally mimics the API of the other two libraries.
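As a minimal sketch of that idea, the function below touches only `fromstring` and `findall`, which both lxml.etree and the standard library's ElementTree provide, so it behaves the same whichever import succeeds. The function name `count_items` is made up for illustration. Because xml.etree.cElementTree was removed in Python 3.9, the sketch falls back to xml.etree.ElementTree, which has used the C accelerator automatically since Python 3.3.

```python
try:
    from lxml import etree  # third-party; used only if it is installed
except ImportError:
    import xml.etree.ElementTree as etree  # standard library fallback


def count_items(xml_text):
    """Count direct <item> children using only calls both modules provide."""
    root = etree.fromstring(xml_text)
    return len(root.findall("item"))
```

The rest of the code never needs to check which module was imported, as long as it stays within the shared API.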

Here's a similar example from Django's codebase:

# Use the C (faster) implementation if possible
try:
    from yaml import CSafeLoader as SafeLoader
    from yaml import CSafeDumper as SafeDumper
except ImportError:
    from yaml import SafeLoader, SafeDumper

Python internally does this for a lot of built-in modules. There's a slower, pure Python version that's used as a fallback for the faster C version.
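The pattern usually looks like the sketch below: define the pure-Python version first, then let an optional compiled module override it. The standard library's heapq module follows this shape with its `_heapq` accelerator. Here `_fasttotal` is a hypothetical extension module that most likely does not exist on your machine, so the Python version is what runs.

```python
def total(values):
    """Pure-Python implementation; always available."""
    result = 0
    for value in values:
        result += value
    return result


try:
    # Hypothetical C extension: if present, its total() replaces the one above.
    from _fasttotal import total
except ImportError:
    pass  # keep the pure-Python definition
```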

However, as answered in "Error handling when importing modules", this approach seems to be used frequently in Python programming:

Your lxml.etree example substituted slower libraries for faster ones. The linked example code defines a common, cross-platform interface (getpass) to a bunch of libraries that all do the same thing (prompt you for your password). The author handles the ImportErrors because those individual modules may not exist depending on your operating system.

You could replace some of the try blocks with if platform.system() == 'Windows' and similar code, but even within a single OS there may be better modules that perform an identical task, so the try blocks simply keep the selection logic short. In the end getpass still prompts the user for their password with the exact same API, which is all you really care about.
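A rough sketch of selecting a backend by attempting imports rather than checking the OS name, loosely modeled on how getpass picks its implementation. The function `pick_backend` and its return values are invented for illustration; termios and msvcrt are real standard-library modules that exist only on POSIX and Windows respectively.

```python
def pick_backend():
    """Return the name of the first console module that imports on this OS."""
    try:
        import termios  # available on POSIX systems only
        return "termios"
    except ImportError:
        pass
    try:
        import msvcrt  # available on Windows only
        return "msvcrt"
    except ImportError:
        # Neither module exists; a real library would use a plain-input fallback.
        return "fallback"
```

The caller never names an operating system; it just gets whichever backend is actually importable.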

I usually use import chains because the output is more controlled.

Raising Errors

Traceback (most recent call last):
  File "core.py", line 1, in <module>
ImportError: <error description>

Import Chains

i Importing "lxml.etree"
x Error Importing "lxml.etree"
i Importing "xml.etree.cElementTree" on Python 2.5+
x Error Importing "xml.etree.cElementTree" on Python 2.5+
i Please Install "lxml.etree" or "xml.etree.cElementTree" on Python 2.5+
i Exit with code 1
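Output of that kind could be produced by a helper along these lines. This is a sketch under assumptions: `import_first` is a hypothetical name, the message prefixes simply copy the sample output above, and xml.etree.ElementTree stands in for cElementTree, which no longer exists on Python 3.9+.

```python
import importlib
import sys


def import_first(candidates):
    """Try each module name in order; return the first that imports."""
    for name in candidates:
        print(f'i Importing "{name}"')
        try:
            return importlib.import_module(name)
        except ImportError:
            print(f'x Error Importing "{name}"')
    # Nothing could be imported: tell the user what to install, then exit.
    names = " or ".join(f'"{n}"' for n in candidates)
    print(f"i Please Install {names}")
    print("i Exit with code 1")
    sys.exit(1)


etree = import_first(["lxml.etree", "xml.etree.ElementTree"])
```

The trade-off is that the traceback is hidden, so the messages need to carry enough detail for the user to act on.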
