
Python Relative Path Import: Import packages from another project directory

I know there are many questions on this topic but none of them helped me much.

I have a Python project directory, git_project (a git repository). I want to create a separate directory called notebooks where I will keep all the notebooks I use for analysis with git_project. I don't want to put the notebooks in the root of git_project. I have placed both git_project and notebooks in a general directory where I keep all of my projects. I have the following structure:

my_projects
│
├── notebooks
│   └── notebook.ipynb
└── git_project
    ├── config
    │   └── cfg.json
    └── source
        └── config.py

The contents of config.py :

import json
def get_cfg():
     with open('config/cfg.json', 'r') as f:
         cfg = json.load(f)
     return cfg

Contents of the notebook.ipynb :

import sys
import os

module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

from git_project.source.config import get_cfg
get_cfg()

Now when I run the code in notebook.ipynb I get the following error:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-7-6796ee7f0100> in <module>
----> 1 get_cfg()

~/Documents/my_projects/git_project/source/config.py in get_cfg()
      1 def get_cfg():
----> 2     with open('config/cfg.json', 'r') as f:
      3         cfg = json.load(f)
      4     return cfg

FileNotFoundError: [Errno 2] No such file or directory: 'config/cfg.json'

However, if I move notebook.ipynb to the root of git_project, I do not get this error. This is just one example; I have many similar problems in other modules of git_project. git_project contains code that is already running in production, so changing anything in it is not feasible. As I said, I do not want to move the notebooks inside git_project; I would rather keep them in a parallel directory for analysis purposes. I can provide more information if required.

I am using Python 3.6+, which no longer requires an __init__.py file to treat a directory as a package.

What should I do in order for this to work? Any help will be much appreciated.

The issue is that when you call open('config/cfg.json', 'r'), the path is resolved relative to the current working directory of the Python process, not relative to config.py. In this case, that is your my_projects/notebooks directory. You can see this by adding the following prints in get_cfg() inside config.py:

import os
print(os.getcwd())  # this prints out the current working directory
print(__file__)     # this prints out the path of this script
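To illustrate the point, here is a minimal sketch showing that the same relative path resolves to different files depending on the current working directory. The helper name resolve_like_open is hypothetical, and a temporary directory stands in for the notebooks folder.

```python
import os
import tempfile


def resolve_like_open(relative_path):
    """Return the absolute path that open(relative_path) would try to use."""
    # Relative paths are joined onto the current working directory.
    return os.path.normpath(os.path.join(os.getcwd(), relative_path))


original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    tmp = os.path.realpath(tmp)  # normalise symlinked temp locations
    try:
        os.chdir(tmp)
        resolved_in_tmp = resolve_like_open('config/cfg.json')
    finally:
        os.chdir(original_cwd)  # restore before the temp dir is removed

resolved_in_original = resolve_like_open('config/cfg.json')
print(resolved_in_tmp)
print(resolved_in_original)
```

The two printed paths differ even though the argument is identical, which is why get_cfg() only works when the notebook happens to run from the root of git_project.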

As Ahmet suggested, modifying the path to ../git_project/config/cfg.json will work, but your Python code will then be tied to the location of the notebooks folder. If you decide to move the notebooks folder, it will break again. One more robust option is to build the path from the script's own location, __file__:

import json
import os
def get_cfg():
    script_dirname = os.path.dirname(__file__)
    config_path = os.path.join(script_dirname, '..', 'config', 'cfg.json')
    with open(config_path, 'r') as f:
        cfg = json.load(f)
    return cfg
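The same idea can also be written with pathlib. This is a sketch, assuming that source/ and config/ are siblings as in the question; the function name load_cfg_relative_to and its module_file parameter are hypothetical, introduced so the example can run against a throwaway copy of the layout.

```python
import json
import tempfile
from pathlib import Path


def load_cfg_relative_to(module_file):
    """Load <project>/config/cfg.json given the path of <project>/source/config.py."""
    # Two levels up from the module file is the project root.
    project_root = Path(module_file).resolve().parent.parent
    config_path = project_root / 'config' / 'cfg.json'
    with config_path.open('r') as f:
        return json.load(f)


# Demo on a temporary copy of the directory layout from the question.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / 'git_project'
    (root / 'config').mkdir(parents=True)
    (root / 'source').mkdir()
    (root / 'config' / 'cfg.json').write_text('{"debug": true}')
    fake_module = root / 'source' / 'config.py'
    fake_module.write_text('')
    cfg = load_cfg_relative_to(fake_module)

print(cfg)
```

Inside the real config.py the call would be load_cfg_relative_to(__file__), so the result no longer depends on where the notebook is started.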

A similar suggestion appears in "Reading file using relative path in python project". This is also the approach suggested in the python-packaging docs:

Files which are to be used by your installed library (eg data files to support a particular computation method) should usually be placed inside of the Python module directory itself. ... That way, code which loads those files can easily specify a relative path from the consuming module's __file__ variable.

If you don't want to touch any file inside git_project, you can instead run a change-directory command in your notebook so that the working directory points to the right location:

In [1]: %cd ../git_project

This line needs to be run once each time you restart the notebook kernel. You can check that you ended up in the right directory by listing its contents from the notebook:

In [2]: %ls
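If changing the directory for the whole kernel session is too broad, the change can be scoped to just the calls that need it. A minimal sketch using only the standard library; the name working_directory is hypothetical (Python 3.11+ ships a similar contextlib.chdir), and a temporary directory stands in for ../git_project.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily switch the current working directory to path."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # restore even if the body raises


# Demo with a throwaway directory in place of the project root.
before = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        inside = os.getcwd()
after = os.getcwd()
print(inside)
print(after == before)
```

In the notebook this would be used as `with working_directory('../git_project'): cfg = get_cfg()`, leaving the notebook's own working directory untouched afterwards.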

As a followup from the discussion in the comments...

Even though the approaches with relative paths work, it is often better to use something more scalable: an environment variable that holds your project root.

With this approach your notebooks are truly independent of the project location and can be used with as many projects as you want. Here is a great explanation of how to use environment variables in Jupyter. I prefer the dotenv approach. Create a .env file in your notebooks folder and add a variable with your project path:

MY_PROJECT_ROOT=/usr/any/path/you/want

Then, in your notebook:

import os
import sys  # needed for the sys.path check below
from dotenv import load_dotenv
load_dotenv()  # this line loads .env file

And then your code:

module_path = os.path.abspath(os.getenv('MY_PROJECT_ROOT'))
if module_path not in sys.path:
    sys.path.append(module_path)
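Note that os.getenv returns None when the variable is not set, in which case os.path.abspath raises a TypeError that does not point at the real cause. A slightly more defensive sketch follows; the helper name add_project_root_to_path is hypothetical, and the demo sets the variable by hand instead of reading a .env file.

```python
import os
import sys


def add_project_root_to_path(var_name='MY_PROJECT_ROOT'):
    """Append the directory named by an environment variable to sys.path."""
    root = os.getenv(var_name)
    if not root:
        # Fail early with a message that names the missing variable.
        raise RuntimeError('{} is not set; check your .env file'.format(var_name))
    module_path = os.path.abspath(root)
    if module_path not in sys.path:
        sys.path.append(module_path)
    return module_path


# Demo: in a notebook, load_dotenv() would normally populate this variable.
os.environ['MY_PROJECT_ROOT'] = os.getcwd()
project_root = add_project_root_to_path()
print(project_root in sys.path)
```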

A simple solution is to change the working directory.

Initially the working directory is the notebooks directory:

from sys import path
import os
print("Current Working Directory " , os.getcwd())

Output:

Current Working Directory  /home/user/git_project/notebooks

Then you change it to the root of your project (this relies on sys.path[0] being the notebook's directory, as it is in this example):

os.chdir(os.path.dirname(path[0]))
print("New Working Directory " , os.getcwd())

Output:

New Working Directory  /home/user/git_project

After that, all imports and file paths that are relative to the project root directory should work.
