As a recent question hinted, I'm looking for a way to speed up operations on a Git repository with a very large number of files (~6 million). I'd rather not use submodules.

The problem is that operations are quite slow. Is it possible to keep one large repository but tell Git to focus on only a portion of it? I thought creating a sparse checkout would do it, but the `read-tree` operation deletes the files not listed in the sparse-checkout file and takes a very long time. Is there a way to run `read-tree` that leaves all the other files where they are, with a cost proportional only to the number of files listed in the sparse-checkout file?
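For context, the classic sparse-checkout workflow referred to above looks roughly like this. This is a self-contained sketch against a throwaway demo repository (the repo, paths, and file names are made up for illustration); in a real 6-million-file repository the `read-tree` step is exactly where the slowness appears.

```shell
#!/bin/sh
# Sketch of the classic (Git 1.7+) sparse-checkout workflow.
# The demo repo below exists only to make the example runnable.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
mkdir -p keep drop
echo a > keep/a.txt
echo b > drop/b.txt
git add . && git commit -qm init

# enable sparse checkout and restrict the work tree to keep/
git config core.sparseCheckout true
echo "keep/" > .git/info/sparse-checkout
# read-tree re-applies the patterns: files outside keep/ are
# removed from the working directory (but stay in history)
git read-tree -mu HEAD
ls   # only keep/ remains in the working directory
```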
Not currently, no. Git only recently (1.7+) added any sparse-checkout support at all, and it's still fairly bare-bones, mostly because Git wasn't really designed to work with only part of a repository.
It was designed as a one-repository-per-project version control system. Submodules are the mechanism it offers for "projects" composed of many large subcomponents.
First, I would suggest learning and using Submodules.
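The idea is to split the large subcomponents into their own repositories and reference them from a parent repository. A minimal self-contained sketch (all repositories here are throwaway local ones created for illustration; the `protocol.file.allow` override is only needed because recent Git versions restrict file-protocol submodules by default):

```shell
#!/bin/sh
# Sketch: one repo per large subcomponent, referenced as a submodule.
set -e
work=$(mktemp -d)

# a "component" repository standing in for one large subcomponent
git init -q "$work/component"
(cd "$work/component" \
  && git config user.email d@example.com && git config user.name demo \
  && echo lib > lib.txt && git add . && git commit -qm init)

# the parent project records the component as a submodule,
# pinning it to a specific commit
git init -q "$work/parent"
cd "$work/parent"
git config user.email d@example.com
git config user.name demo
git -c protocol.file.allow=always submodule add "$work/component" component
git commit -qm "add component as a submodule"
git submodule status   # shows the pinned commit of component
```

A fresh clone would then use `git clone --recurse-submodules` to fetch the parent and its components together, or fetch only the components it actually needs.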
You can script what you like with

```
git ls-tree sha1
git show sha1:path/to/some/file.txt
```

and other low-level commands, combined with standard shell tools such as `xargs`, `grep`, and `cut`, and piping.
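For example, plumbing commands let you inspect files in a commit without ever checking them out, which sidesteps the cost of a huge working tree. A self-contained sketch (the demo repo and paths are invented for illustration):

```shell
#!/bin/sh
# Sketch: script over a commit's contents with plumbing commands,
# without touching the working tree. Demo repo created just for show.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email d@example.com
git config user.name demo
mkdir -p docs src
printf 'hello\nworld\n' > docs/a.txt
echo 'int x;' > src/x.c
git add . && git commit -qm init

# list every path in HEAD, keep only .txt files,
# and print the first line of each straight from the object store
git ls-tree -r --name-only HEAD |
  grep '\.txt$' |
  while IFS= read -r p; do
    git show "HEAD:$p" | head -n 1
  done
```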