
What is the right way to refer to Bazel data files in Python?

Suppose I have the following BUILD file:

py_library(
  name = "foo",
  srcs = ["foo.py"],
  data = ["//bar:data.json"],
)

How should I refer to data.json from foo.py? I would like to write something like the code below. What should I use for some_path?

with open(os.path.join(some_path, "bar/data.json"), 'r') as fp:
    data = json.load(fp)

I couldn't find much general documentation about *.runfiles online; any pointers would be appreciated!

Here is a function that should return the path to the runfiles root for any py_binary, in all the cases I'm aware of:

import os
import re
import sys


def find_runfiles():
    """Find the runfiles tree (useful when _not_ run from a zip file)."""
    stub_filename = os.path.abspath(sys.argv[0])

    # Case 1: the launcher script has a sibling "<script>.runfiles" directory.
    module_space = stub_filename + '.runfiles'
    if os.path.isdir(module_space):
        return module_space

    # Case 2: the script path is already inside a "*.runfiles" tree.
    matchobj = re.match(r"(.*\.runfiles)", stub_filename)
    if matchobj:
        return matchobj.group(1)

    raise RuntimeError('Cannot find .runfiles directory for %s' %
                       sys.argv[0])

For the example in your question, you could use it like so:

with open(os.path.join(find_runfiles(), "name_of_workspace/bar/data.json"), 'r') as fp:
    data = json.load(fp)
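To see what the regular-expression fallback in find_runfiles() does without running under Bazel, here is a minimal sketch. The helper name runfiles_root_from_path and the example paths are hypothetical and chosen only for illustration; the pattern is the one used above.

```python
import re


def runfiles_root_from_path(path):
    """Return the part of path up to and including '.runfiles', or None."""
    match = re.match(r"(.*\.runfiles)", path)
    return match.group(1) if match else None


# Hypothetical location of a script running inside a runfiles tree.
example = "/home/me/ws/bazel-bin/foo.runfiles/name_of_workspace/foo.py"

print(runfiles_root_from_path(example))
# prints /home/me/ws/bazel-bin/foo.runfiles

print(runfiles_root_from_path("/usr/bin/python3"))
# prints None
```

Because `.*` is greedy, a path containing several `.runfiles` components would match up to the last one. That is worth keeping in mind if binaries are nested inside each other's runfiles.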

Note that this function won't help if you build zipped executables of your Python apps (probably using subpar); for those you will need some more code. The next snippet includes get_resource_filename() and get_resource_directory(), which work for both regular py_binary targets and .par binaries:

import atexit
import os
import re
import shutil
import sys
import tempfile
import zipfile


def get_resource_filename(path):
    """Find or extract a single resource file and return its location."""
    zip_path = get_zip_path(sys.modules.get("__main__").__file__)
    if zip_path:
        tmpdir = tempfile.mkdtemp()
        atexit.register(lambda: shutil.rmtree(tmpdir, ignore_errors=True))
        zf = BetterZipFile(zip_path)
        zf.extract(member=path, path=tmpdir)
        return os.path.join(tmpdir, path)
    elif os.path.exists(path):
        return path
    else:
        path_in_runfiles = os.path.join(find_runfiles(), path)
        if os.path.exists(path_in_runfiles):
            return path_in_runfiles
        else:
            raise ResourceNotFoundError


def get_resource_directory(path):
    """Find or extract an entire subtree and return its location."""
    zip_path = get_zip_path(sys.modules.get("__main__").__file__)
    if zip_path:
        tmpdir = tempfile.mkdtemp()
        atexit.register(lambda: shutil.rmtree(tmpdir, ignore_errors=True))
        zf = BetterZipFile(zip_path)
        members = []
        for fn in zf.namelist():
            if fn.startswith(path):
                members += [fn]
        zf.extractall(members=members, path=tmpdir)
        return os.path.join(tmpdir, path)
    elif os.path.exists(path):
        return path
    else:
        path_in_runfiles = os.path.join(find_runfiles(), path)
        if os.path.exists(path_in_runfiles):
            return path_in_runfiles
        else:
            raise ResourceNotFoundError


def get_zip_path(path):
    """If path is inside a zip file, return the zip file's path."""
    parent = os.path.dirname(path)
    if zipfile.is_zipfile(path):
        return path
    elif parent == path:
        # Reached the filesystem root (or an empty relative path): no zip found.
        return None
    return get_zip_path(parent)


class ResourceNotFoundError(RuntimeError):
    pass

def find_runfiles():
    """Find the runfiles tree (useful when _not_ run from a zip file)."""
    stub_filename = os.path.abspath(sys.argv[0])

    # Case 1: the launcher script has a sibling "<script>.runfiles" directory.
    module_space = stub_filename + '.runfiles'
    if os.path.isdir(module_space):
        return module_space

    # Case 2: the script path is already inside a "*.runfiles" tree.
    matchobj = re.match(r"(.*\.runfiles)", stub_filename)
    if matchobj:
        return matchobj.group(1)

    raise RuntimeError('Cannot find .runfiles directory for %s' %
                       sys.argv[0])


class BetterZipFile(zipfile.ZipFile):
    """Shim around ZipFile that preserves permissions on extract."""

    def extract(self, member, path=None, pwd=None):

        if not isinstance(member, zipfile.ZipInfo):
            member = self.getinfo(member)

        if path is None:
            path = os.getcwd()

        ret_val = self._extract_member(member, path, pwd)
        attr = member.external_attr >> 16
        os.chmod(ret_val, attr)
        return ret_val

Using this second code snippet, your example would look like:

with open(get_resource_filename("name_of_workspace/bar/data.json"), 'r') as fp:
    data = json.load(fp)
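The zip detection relies on walking up from the main module's path until some ancestor turns out to be a zip archive. The sketch below illustrates that idea in isolation; find_enclosing_zip, demo, and the archive name app.par are hypothetical names for this example, and the loop is an iterative variant rather than the exact function from the snippet above.

```python
import os
import tempfile
import zipfile


def find_enclosing_zip(path):
    """Walk up from path; return the first ancestor that is a zip file, else None."""
    while True:
        if zipfile.is_zipfile(path):
            return path
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the filesystem root or an empty relative path.
            return None
        path = parent


def demo():
    """Build a throwaway archive and look up one path inside it and one outside."""
    with tempfile.TemporaryDirectory() as tmpdir:
        par_path = os.path.join(tmpdir, "app.par")  # hypothetical archive name
        with zipfile.ZipFile(par_path, "w") as zf:
            zf.writestr("name_of_workspace/bar/data.json", '{"answer": 42}')

        # A module "inside" the archive, as a zipped binary would report it.
        inside = os.path.join(par_path, "name_of_workspace", "__main__.py")
        # An ordinary path with no archive among its ancestors.
        outside = os.path.join(tmpdir, "plain", "main.py")

        return par_path, find_enclosing_zip(inside), find_enclosing_zip(outside)


if __name__ == "__main__":
    par_path, found_inside, found_outside = demo()
    print(found_inside == par_path)  # the archive itself is found
    print(found_outside)             # no enclosing archive
```

zipfile.is_zipfile() returns False for paths that do not exist or that are directories, so the walk simply continues upward until it either hits an archive or runs out of parents.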

Short answer: os.path.dirname(__file__)

Here is the full example:

$ ls
bar/  BUILD  foo.py  WORKSPACE

$ cat BUILD
py_binary(
    name = "foo",
    srcs = ["foo.py"],
    data = ["//bar:data.json"],
)

$ cat foo.py
import json
import os

ws = os.path.dirname(__file__)
with open(os.path.join(ws, "bar/data.json"), 'r') as fp:
  print(json.load(fp))

$ cat bar/BUILD
exports_files(["data.json"])

$ bazel run :foo

Edit: this doesn't work well when your package is in a subdirectory. You may need to walk back up the tree with os.path.dirname.
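As a rough sketch of that "walk back up" step: if foo.py lives in a package some levels below the workspace root, applying os.path.dirname once per level gets back to the directory that "bar/data.json" is relative to. The helper name workspace_root and the package_depth parameter are assumptions made for this illustration; they are not part of Bazel.

```python
import os


def workspace_root(module_file, package_depth):
    """Return the directory package_depth levels above module_file's directory."""
    root = os.path.dirname(os.path.abspath(module_file))
    for _ in range(package_depth):
        root = os.path.dirname(root)
    return root


# Hypothetical layout: foo.py is in package //pkg/sub, i.e. two levels
# below the workspace root, so package_depth is 2.
example_file = os.path.join(os.sep, "ws", "pkg", "sub", "foo.py")
data_path = os.path.join(workspace_root(example_file, 2), "bar", "data.json")
print(data_path)
```

In foo.py itself you would pass `__file__` as module_file. The depth has to be kept in sync with the package's location by hand, which is the main weakness of this approach compared with locating the runfiles root.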
