
cvs2git migrates binary files (flagged with -kb) that differ between git and CVS

I've run a cvs2git migration on a CVS repository that's over 2 GB. I wrote a script that traverses the new git repository and the CVS module to verify that the objects are the same. The text files migrate just fine and have the same sha1sum; however, ALL of the binary files have different sha1sums, even though they are all flagged as binary in CVS (-kb). Every other topic I've read about cvs2git and binary files blames the issue on the files not being flagged as binary (-kb), but that's not the case here. What else could be the problem?

The scripts I execute to do the migration are below:

./Python-2.7.3/python ./cvs2svn-trunk/cvs2git \
--blobfile=/path/to/git-blob.dat \
--dumpfile=/path/to/git-dump.dat \
--username=cvs2git \
/cvsroot/database

cd /gitroot; mkdir database; cd database; git init

cat /path/to/git-{blob,dump}.dat | git fast-import
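The verification script described in the question is not shown. For reference, a check of that kind could look roughly like the sketch below, written for Python 3. It assumes that both sides are available as checked-out working trees (a CVS checkout and a git checkout); the function names and the skipped directory names are illustrative, not taken from the question.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, read in chunks as raw bytes."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_tree(root, skip_dirs=(".git", "CVS")):
    """Map each relative file path under root to its SHA-1."""
    hashes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata directories in place.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            full = os.path.join(dirpath, name)
            hashes[os.path.relpath(full, root)] = sha1_of_file(full)
    return hashes


def compare_trees(cvs_checkout, git_checkout):
    """Return (mismatched, only_in_cvs, only_in_git) as sorted lists."""
    cvs_hashes = hash_tree(cvs_checkout)
    git_hashes = hash_tree(git_checkout)
    common = set(cvs_hashes) & set(git_hashes)
    mismatched = sorted(p for p in common if cvs_hashes[p] != git_hashes[p])
    only_in_cvs = sorted(set(cvs_hashes) - set(git_hashes))
    only_in_git = sorted(set(git_hashes) - set(cvs_hashes))
    return mismatched, only_in_cvs, only_in_git
```

Files are opened in binary mode so that the hashes reflect the exact bytes on disk. If the mismatched list contains only -kb files, the difference was introduced during checkout or conversion, not by the comparison itself.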

Your problem could be explained if your repository is a CVSNT repository, as opposed to a standard CVS repository. CVS records once, for all revisions, whether a file is binary, whereas CVSNT records the file type revision by revision. cvs2svn/cvs2git only reads the file-wide binary attribute, not CVSNT's revision-by-revision attributes. Therefore, it doesn't know that a file has been marked binary in CVSNT.

This is the main reason that cvs2svn/cvs2git does not officially support converting from CVSNT repositories.
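To see what the file-wide attribute looks like on disk: in a standard RCS/CVS `,v` file, the -kb flag is stored in the admin header as `expand @b@;`. The Python 3 sketch below extracts that value from the raw bytes of a `,v` file. The function name is illustrative, and this is not how cvs2git itself parses files. It also does not look at CVSNT's per-revision attributes, so a missing `expand` entry on a file that CVSNT reports as binary would be consistent with the explanation above.

```python
import re

# File-wide keyword-expansion entry of an RCS admin header, e.g. "expand @b@;".
# Inside RCS @-strings a literal "@" is doubled, so this pattern does not
# match look-alike text embedded in log messages or file contents.
_EXPAND_RE = re.compile(rb"\bexpand\s+@([^@]*)@;")


def file_wide_expand_mode(rcs_bytes):
    """Return the file-wide expansion mode (e.g. 'b'), or None if absent."""
    match = _EXPAND_RE.search(rcs_bytes)
    if match is None:
        return None
    return match.group(1).decode("ascii")


def is_file_wide_binary(rcs_bytes):
    """True if the admin header marks the whole file as binary (-kb)."""
    return file_wide_expand_mode(rcs_bytes) == "b"
```

A typical call would read a `,v` file from the repository with `open(path, "rb").read()` and pass the bytes in.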

Do these binary files contain strings of the form $Id ...$? That was the problem for me some time ago (cvs2git replaced them with $Id$ in binary files), but it should be fixed in the newest versions; see this commit.

In any case, I recommend using a hex editor to find out what the differences actually are.

I also notice that you don't use an options file. I'm not sure what defaults cvs2git uses in that case, but it would be worth trying a customized version of cvs2git-example.options.
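As a scripted alternative to a hex editor, the Python 3 sketch below does two things: it lists RCS keyword strings such as $Id ...$ found in a file's bytes, and it reports the offset of the first byte at which two versions differ. The keyword list and function names are illustrative assumptions, not part of cvs2git.

```python
import re

# Common RCS keywords, either collapsed ($Id$) or expanded ($Id: ... $).
# The list is illustrative and may not cover every keyword a server expands.
KEYWORD_RE = re.compile(
    rb"\$(Author|Date|Header|Id|Locker|Log|Name|RCSfile|Revision|Source|State)"
    rb"(?::[^$\n]*)?\$"
)


def find_keywords(data):
    """Return (offset, matched_bytes) for each keyword string in data."""
    return [(m.start(), m.group(0)) for m in KEYWORD_RE.finditer(data)]


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if equal."""
    limit = min(len(a), len(b))
    for i in range(limit):
        if a[i] != b[i]:
            return i
    # One input is a prefix of the other, or both are identical.
    return None if len(a) == len(b) else limit
```

If the first difference falls inside a match reported by `find_keywords`, keyword expansion or collapsing is the likely cause of the differing checksums.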
