svn to git conversion (how to check the repository quality)

Currently I'm planning to help with a fairly large git conversion for an open-source project. The repository is quite large (over 60,000 commits), so trial and error is slow.

There are many questions about how a git conversion should be done, but almost no details on how to check whether the conversion is valid.

Of course there are basics like checking out the same revisions in both repos and comparing the contents of the repositories, but history, commit messages, moved files, tracking changes between branches, etc. become more involved.
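The basic content check mentioned above can be sketched as follows. This assumes you have already produced an `svn export` of some revision and a checkout of the corresponding git commit in two separate directories. The function name `compare_trees` is only illustrative, and the `-x` exclude option assumes GNU or BSD diff.

```shell
# compare_trees SVN_EXPORT_DIR GIT_CHECKOUT_DIR
# Hypothetical helper: prints the paths that differ and returns non-zero
# if the two trees are not identical. Version-control metadata is skipped.
compare_trees() {
    diff -r -q -x .git -x .svn "$1" "$2"
}
```

Some differences may be benign: svn keyword expansion, svn:eol-style line endings, and empty directories (which git does not store) can all show up here, so the output is a list to review rather than a pass/fail verdict.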

So my questions are:

  • Which areas should be checked in a newly converted git repository to see that the conversion is correct and succeeded?
  • What are the gotchas/pitfalls to watch out for?
  • Can anyone suggest strategies for evaluating a converted svn project to be sure nothing went wrong during conversion?

Note: currently we're using reposurgeon. That should have no bearing on the answer, though it does mean we have to do a one-off conversion and get it right.

If you use git-svn, you can clone your SVN repository with a local git client, which effectively creates a Git repo with its history intact. Not only is this easy and quick, but you can trust that you are properly synchronized with the existing svn repo, and you can even pull changes that happened after the initial clone.

As for things to look out for, git does not track empty folders. Another gotcha is dealing with binary files, which you hopefully aren't storing in your existing repo. You typically don't want to store large binaries in a git repo. There are a few git-specific solutions you can search for, but it might be tricky if you are pulling the binaries in by cloning the svn repo.
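Both gotchas can be checked mechanically. The sketch below is not from the answer: `find_empty_dirs` and `list_large_blobs` are hypothetical helper names, `-empty` assumes GNU or BSD find, and the size threshold is whatever you consider "large".

```shell
# find_empty_dirs DIR
# Lists directories under DIR (e.g. an svn working copy) that contain
# nothing, ignoring version-control metadata. Git will not keep these.
find_empty_dirs() {
    find "$1" -type d -empty ! -path '*/.svn/*' ! -path '*/.git/*'
}

# list_large_blobs MIN_BYTES
# Run inside the converted git repository: prints "size path" for every
# blob anywhere in history that is at least MIN_BYTES, biggest first.
list_large_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk -v min="$1" '$1 == "blob" && $2 >= min { sub(/^blob /, ""); print }' |
        sort -rn
}
```

For example, `find_empty_dirs path/to/svn-wc` run on the svn side shows which directories will silently disappear, and `list_large_blobs 10000000` run in the git repo shows blobs of roughly 10 MB and up.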

This is not a definitive answer, just some things we have been doing to check that the git conversion is OK.

Find all commits with 3+ parents. It's quite unlikely that these are valid, though there may be exceptions.

git log --all --min-parents=3
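As a complementary overview (an addition, not part of the original answer), you could tally how many commits have 0, 1, 2, 3 or more parents. An unexpected number of root commits or octopus merges then stands out at a glance. `parent_histogram` is a hypothetical helper that reads `git log --format='%H %P'` output on stdin.

```shell
# parent_histogram: reads "commit parent1 parent2 ..." lines and prints
# one "PARENT_COUNT NUMBER_OF_COMMITS" line per distinct parent count.
parent_histogram() {
    awk '{ n[NF - 1]++ } END { for (k in n) print k, n[k] }' | sort -n
}

# Inside the converted repository:
#   git log --all --format='%H %P' | parent_histogram
```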

Find all commits with duplicate parents. Note that running git filter-branch can clean these up afterwards, but that can sometimes be a really slow process.

git log --all --min-parents=2 --format="format:%H: %P" | egrep ':[^:]* ([0-9a-f]+) [^:]*\1'
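If the backreference regex is hard to read or your grep does not support it, the same check can be sketched in awk. This is an alternative written for illustration, not the command from the answer; `dup_parents` is a hypothetical name, and it expects `%H %P` lines on stdin.

```shell
# dup_parents: reads "commit parent1 parent2 ..." lines and prints the
# commit hash whenever the same parent hash appears more than once.
dup_parents() {
    awk '{
        split("", seen)                  # reset the set for each commit
        for (i = 2; i <= NF; i++) {
            if ($i in seen) { print $1; break }
            seen[$i] = 1
        }
    }'
}

# Inside the converted repository:
#   git log --all --min-parents=2 --format='%H %P' | dup_parents
```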

Find all commits that are not merges and don't change any files (possibly svn property changes):

git log --all --max-parents=1 --format="format:%H" --shortstat | pcregrep -v -M "^[a-z0-9]+\n "
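On systems without pcregrep, a rough equivalent can be sketched with `--name-only` and awk. This is an assumption-laden alternative, not the original command: `empty_commits` is a hypothetical name, the `@@` prefix is an arbitrary marker chosen here, and a file whose name starts with `@@` would confuse it.

```shell
# empty_commits: reads `git log --format='@@%H' --name-only` output and
# prints the hash of every commit that lists no changed paths.
empty_commits() {
    awk '
        /^@@/ { if (hash != "" && files == 0) print hash
                hash = substr($0, 3); files = 0; next }
        NF    { files++ }               # any non-blank line is a changed path
        END   { if (hash != "" && files == 0) print hash }
    '
}

# Inside the converted repository:
#   git log --all --max-parents=1 --format='@@%H' --name-only | empty_commits
```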

Thanks to Julien Rivaud for the regex commands!
