
Running Python startup code after modules are loaded

I'm working with Jupyter notebooks and Python kernels with a SparkContext. A coworker has written some Python code that wires Spark events with ipykernel events. When we import his module from a notebook cell, it works in all combinations we need to support: Python 2.7 and 3.5, Spark 1.6 and 2.x, Linux only.

Now we want to enable that code automatically for all Python kernels. I put the import into our sitecustomize.py. That works fine for Spark 2.x, but not for Spark 1.6. Kernels with Spark 1.6 don't get an sc anymore, and something is so screwed up that unrelated imports like matplotlib.cbook fail. When I delay that import for a few seconds using a timer, it works. Apparently, the code in sitecustomize.py is executed too early for importing the module which connects Spark with the ipykernel.

I'm looking for a way to delay that import until Spark and/or ipykernel are fully initialized. But it should still execute as part of the kernel startup, before any notebook cells get executed. I found this trick to delay code execution until sys.argv is initialized. But I don't think it can work on global variables like sc, considering that Python globals are still local to modules. So far, the best I can come up with is using a timer to check every second whether certain modules are present in sys.modules. But that isn't very reliable, because I don't know how to distinguish a module that's fully initialized from one that's still in the process of being loaded.
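For reference, the timer-based polling described above could look roughly like the sketch below. The helper name run_when_ready and its parameters are made up for illustration. It has exactly the weakness mentioned: Python adds a module to sys.modules before the module body has finished executing, so presence alone does not prove the module is fully initialized.

```python
import sys
import threading


def run_when_ready(required, action, interval=1.0, max_checks=30):
    """Call action() once every name in `required` is present in sys.modules.

    Hypothetical helper: it only checks presence, not whether the
    modules have finished initializing.
    """
    def check(remaining):
        if all(name in sys.modules for name in required):
            action()
        elif remaining > 0:
            # Not ready yet: schedule another check on a daemon timer thread
            # so the timer never keeps the process alive on its own.
            timer = threading.Timer(interval, check, args=(remaining - 1,))
            timer.daemon = True
            timer.start()
        # Otherwise give up silently after max_checks retries.

    check(max_checks)
```

Note that when the first check fails, action() later runs on the timer thread rather than the main thread, which may matter for code that touches kernel state.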

Any ideas on how to hook in startup code that executes late during startup? A solution that is specific to pyspark and/or ipykernel would satisfy my needs.

Hmmm, you don't really give many details about what errors you encounter.

I think the canonical way to customize startup behaviour for the IPython kernel is to set up a config file and set the exec_lines option.

For example, you would put this in ~/.ipython/profile_default/ipython_config.py:

# sample ipython_config.py
c = get_config()

# Lines of code to run in the kernel at startup
c.InteractiveShellApp.exec_lines = [
    'import numpy',
    'import scipy'
]
# Script files to run in the kernel at startup
c.InteractiveShellApp.exec_files = [
    'mycode.py',
    'fancy.ipy'
]
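Applied to the question, a file listed in exec_files could do the import itself. The following is a minimal sketch of what such a file might contain. The module name spark_kernel_events is a placeholder for the coworker's module, and the sketch assumes that sc has already been created by the time exec_files run, which depends on how the Spark kernel is set up.

```python
# Hypothetical contents of a startup script such as mycode.py
import importlib
import logging

# Placeholder name: substitute the module that wires Spark to ipykernel.
MODULE_NAME = "spark_kernel_events"


def load_optional(name):
    """Import `name` and return the module, or None if it is unavailable."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        # Log instead of raising so a missing module does not break startup.
        logging.getLogger(__name__).warning("could not import %s: %s", name, exc)
        return None


hook_module = load_optional(MODULE_NAME)
```

Catching ImportError keeps the kernel usable on machines where the module is not installed. If a failed import should be visible right away, drop the try/except and let the exception propagate.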
