Handle large repository with binaries in git-svn

At my workplace there is a large SVN repository (more than 80,000 revisions) with lots of binary files. I am experimenting with git-svn on it, but cloning the whole history seems impractical: it takes more than 100 GB and nearly a week to complete.

I have tried cloning a subset of revisions (the last ~10,000) and that works reasonably well. The main drawback of this approach is that blame only goes back as far as the oldest revision I fetched.
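For readers who have not done a partial clone before, here is a minimal sketch of how the command can be put together. It relies on the `-r START:HEAD` revision range that git-svn accepts; the URL, the starting revision, and the target directory are placeholders, not values from my repository.

```python
from typing import List


def partial_clone_command(url: str, start_rev: int, target_dir: str) -> List[str]:
    """Build a `git svn clone` command that fetches only start_rev..HEAD."""
    if start_rev < 1:
        raise ValueError("start_rev must be a positive SVN revision number")
    # -r START:HEAD limits the fetch to that revision range
    return ["git", "svn", "clone", "-r", f"{start_rev}:HEAD", url, target_dir]


if __name__ == "__main__":
    # Placeholder values; pass the list to subprocess.run() to actually clone
    cmd = partial_clone_command("https://svn.example.com/repo/trunk", 70000, "repo-git")
    print(" ".join(cmd))
```

The function only builds the argument list, so it can be inspected or logged before anything is fetched.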

Ideally, I would like to clone the whole history for source files and only the last thousand revisions for binaries. Is that somehow possible? Any other suggestions?

I've run into the same issue at my workplace, so I'll share my solution.

The solution was not, unfortunately, to do what you're envisioning (though I did originally think of that too). The solution is to refactor the repository, separating binaries from sources. This is easier said than done, as you will need to get your department on board and it will impact your team's workflow, but if you can pull it off, it will be worth it.

There are really three types of files to consider:

  • Sources should be isolated in a repository. That's simple enough to understand.
  • 3rd party binaries may also be committed to the repository, though importing them through svn:externals avoids lots of potential duplication. These binaries aren't so bad because you won't have lots of history with them.
  • Generated binaries (outputs of your compilation) are the worst by far! These change with every compilation and maintaining the history doesn't make sense. Version control systems aren't intended for dealing with this. Some companies love committing binaries because they can check out the latest load without compiling it, but there is a huge cost.
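Before splitting anything, it helps to know which files in a working copy are actually binary. The following is a rough sketch of such an inventory, not something from my actual setup: it assumes that a NUL byte in the first few kilobytes marks a file as binary, which is a common heuristic but will misclassify some files (UTF-16 text, for example). The function names are made up for illustration.

```python
import os
from typing import List, Tuple


def looks_binary(path: str, sample_size: int = 8000) -> bool:
    """Heuristic: a file is treated as binary if its first bytes contain a NUL."""
    with open(path, "rb") as handle:
        return b"\0" in handle.read(sample_size)


def inventory(root: str) -> Tuple[List[str], List[str]]:
    """Return (sources, binaries) as sorted paths relative to root."""
    sources: List[str] = []
    binaries: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata directories
        dirnames[:] = [d for d in dirnames if d not in {".svn", ".git"}]
        for name in filenames:
            path = os.path.join(dirpath, name)
            bucket = binaries if looks_binary(path) else sources
            bucket.append(os.path.relpath(path, root))
    return sorted(sources), sorted(binaries)
```

Running `inventory` on a checkout gives two lists you can review by hand to decide what stays with the sources, what becomes an external, and what should leave version control entirely.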

The solution that I've been implementing is to make every binary in a major product build and package from a single command. A build machine then builds, packages, and archives nightly (or on-demand) builds. People can get the latest binaries from the build machine, and as long as the packages are install-friendly, it's even easier than doing an svn up because you won't have so many updates/conflicts/merges. This takes generated binaries completely out of SVN.
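To make the build-machine idea concrete, here is a minimal sketch of the archive step, assuming the single build command leaves its packages in one output directory. The function names, the zip format, and the date-based label are my own choices for illustration; use whatever your build tooling already provides.

```python
import shutil
import subprocess
from datetime import date
from pathlib import Path
from typing import List, Optional


def build(command: List[str], cwd: str) -> None:
    """Run the single build-and-package command; raises if it fails."""
    subprocess.run(command, cwd=cwd, check=True)


def archive_build(output_dir: str, archive_root: str, label: Optional[str] = None) -> Path:
    """Zip output_dir into archive_root/<label>.zip and return the archive path."""
    label = label or date.today().isoformat()  # e.g. one archive per nightly build
    root = Path(archive_root).resolve()
    root.mkdir(parents=True, exist_ok=True)
    # make_archive appends the ".zip" extension to the base name itself
    return Path(shutil.make_archive(str(root / label), "zip", root_dir=output_dir))
```

A scheduled job on the build machine would call `build(...)` with your real build command and then `archive_build(...)`, so people fetch a dated package instead of running svn up on generated files.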
