
Check out single subdirectory from minimal repository using JGit

I'm using JGit 6.5.x with Java 17. I have a remote repository that is huge (gigabytes), but I only need temporary access to a single subdirectory (e.g. foo/bar/) for processing. The single subdirectory is really small (hundreds of kilobytes). A shallow, bare clone of the repository is relatively small as well:

try (final Git git = Git.cloneRepository()
    .setURI(REMOTE_REPOSITORY_URI.toASCIIString())
    .setDirectory(LOCAL_REPOSITORY_PATH.toFile())
    .setBare(true)
    .setDepth(1)
    .call()) {
  System.out.println("cloned shallow, bare repository");
}

Is there a way to clone a shallow, bare repository like that (or any other minimal version of the repository), and then check out just the single subdirectory foo/bar to some other directory temporarily so that I can process those files using the normal Java file system API?

Note that I have only just succeeded with the clone above and haven't started looking into how I might check out just a single subdirectory from this bare repository.

Try the solution below.

Note: Before applying any Git changes, make sure you have a backup of any files you need.

Use the git object to create a TreeWalk that will allow you to traverse the repository's tree and find the subdirectory you're interested in. Specify the starting path as the root of the repository:

try (Git git = Git.open(LOCAL_REPOSITORY_PATH.toFile())) {
    Repository repository = git.getRepository();
    Path temporaryDirectory = Files.createTempDirectory("subdirectory");

    // RevWalk and TreeWalk hold resources, so close both via try-with-resources
    try (RevWalk revWalk = new RevWalk(repository);
         TreeWalk treeWalk = new TreeWalk(repository)) {
        // Get the tree for the repository's HEAD commit
        RevCommit commit = revWalk.parseCommit(repository.resolve(Constants.HEAD));
        treeWalk.addTree(commit.getTree());
        // Recurse into subtrees so that the walk reports files (blobs) only
        treeWalk.setRecursive(true);
        // Restrict the walk to the subdirectory you want to extract
        treeWalk.setFilter(PathFilter.create("foo/bar"));

        boolean found = false;
        while (treeWalk.next()) {
            found = true;
            // A bare repository has no work tree, so write each blob out manually
            Path target = temporaryDirectory.resolve(treeWalk.getPathString());
            Files.createDirectories(target.getParent());
            try (OutputStream out = Files.newOutputStream(target)) {
                repository.open(treeWalk.getObjectId(0)).copyTo(out);
            }
        }
        if (!found) {
            throw new IllegalStateException("Subdirectory not found");
        }
    }

    // Now you can use the Java file system API to process the files under
    // temporaryDirectory.resolve("foo/bar")

    // Clean up the temporary directory when you're done
    // (FileUtils here is org.eclipse.jgit.util.FileUtils)
    FileUtils.delete(temporaryDirectory.toFile(), FileUtils.RECURSIVE);
}

In the code above, we use a TreeWalk restricted by a PathFilter to visit every file under the subdirectory you specified (foo/bar). checkout() needs a work tree, which a bare repository lacks, so instead each blob is read through repository.open() and copied into a temporary directory, recreating the foo/bar path beneath it. You can then use the Java file system API to process the files in that directory. This simple copy does not restore file modes or symbolic links and does not handle submodule entries. Don't forget to clean up the temporary directory when you're done.

Note that the code assumes you have the necessary JGit and Java IO imports in place.
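The "process the files, then clean up" step can also be done with the JDK alone. The following is only a minimal sketch: the class name ExtractedFiles and both method names are hypothetical helpers invented for illustration, not part of JGit, and it assumes the files have already been written to some root directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (illustrative names, not part of JGit). */
final class ExtractedFiles {

    private ExtractedFiles() {}

    /** Returns every regular file below root, sorted by path. */
    static List<Path> listRegularFiles(Path root) throws IOException {
        // Files.walk keeps directory handles open, so close the stream
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    /** Deletes root and everything below it; does nothing if root is absent. */
    static void deleteRecursively(Path root) throws IOException {
        if (Files.notExists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order puts children before their parents,
            // so each directory is already empty when it is deleted
            List<Path> deepestFirst = paths.sorted(Comparator.reverseOrder())
                                           .collect(Collectors.toList());
            for (Path path : deepestFirst) {
                Files.delete(path);
            }
        }
    }
}
```

With the temporary directory from the answer, ExtractedFiles.listRegularFiles(temporaryDirectory.resolve("foo/bar")) would give the files to process, and ExtractedFiles.deleteRecursively(temporaryDirectory) could stand in for the cleanup call.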

Inspired by another answer, I was able to get a single-depth clone and check out only a single path without needing a bare clone, while using a similarly small amount of file system space. The benefit of this approach is that only a single top-level directory is needed; the bare repository approach, on the other hand, requires a manual traversal and saving to a separate top-level directory.

The key is to use setNoCheckout(true) in addition to setDepth(1), and then, after cloning, to perform a separate checkout manually that names the requested path. Note that you must call setStartPoint("HEAD") or give a commit hash as the starting point, as there will be no branch because there is not yet a checkout.

try (final Git git = Git.cloneRepository()
    .setURI(REMOTE_REPOSITORY_URI.toASCIIString())
    .setDirectory(LOCAL_REPOSITORY_PATH.toFile())
    .setNoCheckout(true)
    .setDepth(1)
    .call()) {

  // Check out only the requested path from HEAD into the work tree
  git.checkout()
    .setStartPoint("HEAD")
    .addPath("foo/bar")
    .call();

}

This seems to work very nicely! I would imagine it uses something similar to Satyajit Bhatt's answer under the hood.
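One way to confirm that only the requested path was materialized is to list the top level of the work tree with the plain file system API. This is just a sketch under assumptions: WorkTreeCheck and topLevelEntries are hypothetical names made up for illustration, and it assumes the repository metadata lives in the usual .git directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sanity check (illustrative names, not part of JGit). */
final class WorkTreeCheck {

    private WorkTreeCheck() {}

    /** Names of the top-level work tree entries, excluding the .git metadata directory. */
    static List<String> topLevelEntries(Path workTree) throws IOException {
        // Files.list is not recursive; it returns only the direct children
        try (Stream<Path> children = Files.list(workTree)) {
            return children
                .map(child -> child.getFileName().toString())
                .filter(name -> !name.equals(".git"))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

If the checkout was restricted to foo/bar as intended, WorkTreeCheck.topLevelEntries(LOCAL_REPOSITORY_PATH) should contain just the single entry foo.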
