I have a Bazel genrule that runs a custom tool to process a set of input files and generate an output. The problem is that the custom tool takes a long time whenever it runs, but not every change to the input files affects its output. To detect whether a change matters, I have another script that parses the inputs and quickly reports whether the custom tool's output would be any different.
I have not been able to implement this in Bazel. This is how I would like it to work:
INPUT_FILES --------> [RULE1] --------> OUTPUT
      |                  ^
      |                  |
      |                  |
      --------------> [RULE2]
RULE2's output should decide whether RULE1 runs. But when RULE1 does have to run, INPUT_FILES should be available to it. So essentially only RULE2's output should count toward the cache hit/miss calculation for RULE1, and INPUT_FILES should be ignored. Is there a way to accomplish this?
EDIT: I tried some experiments, and I can implement this if I execute RULE1 and RULE2 with sandboxing disabled. That allows RULE2 to access RULE1's inputs without explicitly listing them. This seems hacky, but it could be fine if there were a way to share a single sandbox between the rules instead of executing both without a sandbox.
I'm not aware of a way to do what you're describing; however, there are other strategies that might work for you. (There's an additional complication, I think: RULE2 wouldn't have access to the previous state of INPUT_FILES, so it wouldn't have anything to compare against to see what has changed in the inputs.)
One strategy is to process the input files so that all the inconsequential parts are removed, and the long-running tool in RULE1 only ever sees the "important" stuff. This, of course, depends on exactly what your tools and rules do, but it might work.
As a simple example, you could have a tool that removes comments from code (in a way that preserves line numbers), and then the compiler action only ever sees code-only files. So, if you change only a comment, the input to the compiler stays the same, and Bazel skips the action.
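To make the comment-stripping idea concrete, here is a minimal sketch of such a normalizing tool. It assumes the inputs are Python source files, which is purely an illustration and not something the question states; the function name `strip_comments` is hypothetical. It uses the standard `tokenize` module so that a `#` inside a string literal is left alone, and it keeps every line in place so line numbers do not shift.

```python
import io
import tokenize


def strip_comments(source: str) -> str:
    """Return `source` with comments removed and line numbers unchanged."""
    # Split into lines the same way the tokenizer reads them.
    lines = io.StringIO(source).readlines()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            row, col = tok.start  # row is 1-based, col is 0-based
            line = lines[row - 1]
            newline = "\n" if line.endswith("\n") else ""
            # A comment always runs to the end of its line, so cut there
            # and drop the whitespace that preceded it.
            lines[row - 1] = line[:col].rstrip() + newline
    return "".join(lines)
```

A cheap action could run this over each input and write the stripped copies, and the slow action would then list only the stripped copies as its inputs. Two source files that differ only in comments produce identical stripped output, so the slow action's inputs are unchanged and its cached result can be reused.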
This is similar to what Bazel does to make building Java rules more incremental. There's a tool that generates a "header jar" from Java source code, which contains only the class interfaces, and upstream rules only see the header jar. That way, only changes to the interfaces of classes ever cause upstream rules to be rerun, and changes to comments or method implementations don't.
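The same "interface only" idea can be sketched outside of Java. The snippet below is not Bazel's header-jar tool; it is a rough Python analogy, with a hypothetical function name `interface_of`, that keeps function signatures and replaces every function body with `...`. It assumes Python 3.9 or newer for `ast.unparse`.

```python
import ast


def interface_of(source: str) -> str:
    """Return the source with every function body replaced by `...`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Keep the name, arguments, and decorators; drop the implementation.
            node.body = [ast.Expr(value=ast.Constant(value=...))]
    # Comments are already gone because the parser discards them.
    return ast.unparse(tree)
```

If dependents consume only this reduced file, edits to a function's implementation or comments leave their inputs untouched, while a changed signature produces a different file and triggers a rerun.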