How to split a git repository and follow directory renames?

I currently have a big git repository that contains many projects, each one in its own subdirectory. I need to split it into individual repositories, each project in its own repo.

I tried git filter-branch --prune-empty --subdirectory-filter PROJECT master

However, many project directories went through several renames in their lives, and git filter-branch does not follow renames, so the extracted repo has no history from before the last rename.

How can I effectively extract a subdirectory from one big git repo and follow all of that directory's renames back into the past?

Thanks to @Chronial, I was able to cook up a script to massage my git repo according to my needs:

git filter-branch --prune-empty --index-filter '
    # GNU grep, sed and xargs are assumed (-z, -r, \t and \| are GNU extensions)
    # Delete files which are NOT needed
    git ls-files -z | grep -E -zv "^(NAME1|NAME2|NAME3)/" |
        xargs -0 -r git rm --cached -q
    # Move files to root directory
    git ls-files -s | sed -e "s-\t\(NAME1\|NAME2\|NAME3\)/-\t-" |
        GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
        git update-index --index-info &&
        ( test ! -f "$GIT_INDEX_FILE.new" \
            || mv -f "$GIT_INDEX_FILE.new" "$GIT_INDEX_FILE" )
'

Here is what the script does:

  1. Deletes all files outside of the three directories NAME1, NAME2 or NAME3 that I need (one project was renamed NAME1 -> NAME2 -> NAME3 during its lifetime).

  2. Moves everything inside these three directories to the root of the repository.

  3. Tests whether "$GIT_INDEX_FILE.new" exists before moving it, since importing svn into git creates commits without any files (directory-only commits). This check is needed only if the repo was initially created with 'git svn clone'.
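To make the two text-processing stages of the index filter easier to follow, here is a small sketch that runs them on sample input instead of a real index. The function names paths_to_remove and strip_prefix are hypothetical, the sample paths and the blob hash are made up for illustration, and the trailing slash in the grep pattern is an assumption that each name is a directory. GNU grep is assumed for -z.

```shell
# Hypothetical helpers that mirror the two stages of the index filter.
# NAME1, NAME2 and NAME3 are the placeholder directory names used in the script.

# Stage 1: read a NUL-separated path list and print the paths that would be
# removed, i.e. everything that is not under one of the three directories.
paths_to_remove() {
    grep -E -zv '^(NAME1|NAME2|NAME3)/' | tr '\0' '\n'
}

# Stage 2: strip the leading project directory from lines in
# `git ls-files -s` format: "<mode> <object> <stage><TAB><path>".
strip_prefix() {
    tab=$(printf '\t')
    sed -E "s-${tab}(NAME1|NAME2|NAME3)/-${tab}-"
}

# Sample input (made up): only docs/readme lies outside the project.
printf 'NAME1/a.c\0docs/readme\0NAME3/b.c\0' | paths_to_remove

# Sample index line (made-up hash): NAME2/src/main.c becomes src/main.c.
printf '100644 %s 0\tNAME2/src/main.c\n' \
    e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 | strip_prefix
```

In the real filter the first stage feeds git rm --cached and the second feeds git update-index --index-info, so nothing in the working tree is touched.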

I don't think git has a built-in feature for that. You will have to build your own filter. Just use git filter-branch --prune-empty --tree-filter YOURSCRIPT. Your script will then have to identify the correct folder (maybe by the name of a specific file in it, or maybe you have a list of all the names this project had in the past), remove everything else, and move the folder contents up a level.
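A minimal sketch of what YOURSCRIPT might look like, assuming you already have a list of every name the project directory has had. OLD_NAMES and flatten_project are hypothetical names chosen for this example, and the sketch assumes a find that supports -mindepth and -maxdepth (GNU and BSD find both do).

```shell
# Hypothetical list of every name the project directory has had.
OLD_NAMES="NAME1 NAME2 NAME3"

# Meant to run in the root of the tree that --tree-filter checks out for
# each commit: keep only the project directory's contents, at the top level.
flatten_project() {
    keep=$(mktemp -d) || return 1
    for name in $OLD_NAMES; do
        if [ -d "$name" ]; then
            # Move the directory's entries (dotfiles included) aside.
            find "$name" -mindepth 1 -maxdepth 1 -exec mv {} "$keep"/ \;
        fi
    done
    # Remove everything that is left, then move the kept entries to the root.
    find . -mindepth 1 -maxdepth 1 -exec rm -rf {} +
    find "$keep" -mindepth 1 -maxdepth 1 -exec mv {} . \;
    rmdir "$keep"
}
```

Saved as an executable script that ends with a call to flatten_project, this could be passed by absolute path to --tree-filter. If a commit contains the project under two names at once, entries with the same name would overwrite each other, so that case would need extra handling.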

If your repo is really big and you can't spare a night to run this script, you can achieve the same effect a lot faster with --index-filter, but writing that script will be more complicated. You will have to use git commands that modify the index instead of commands that modify the file system.

I had a very large repository from which I needed to extract a single folder; even --index-filter was predicted to take 8 hours to finish. Here's what I did instead:

  1. Obtain a list of all the past names of the folder. In my case there were only two, old-name and new-name.
  2. For each name:

     $ git checkout master
     $ git checkout -b filter-old-name
     $ git filter-branch --subdirectory-filter old-name

    This will give you several disconnected branches, each containing history for one of the names.

  3. The filter-old-name branch should end with the commit which renamed the folder, and the filter-new-name branch should begin with the same commit. (The same applies if there was more than one rename: you'll wind up with an equivalent number of branches, each with a commit shared with the next one along.) One should delete everything and the other should recreate it again. Make sure that these two commits have identical contents; if they don't, the files were modified in addition to being renamed, and you will need to merge the changes. (In my case I didn't have this problem so I don't know how to solve it.)

    An easy way to check this is to try rebasing filter-new-name on top of filter-old-name and then squashing the two commits together: git should complain that this produces an empty commit. (Note that you will want to do this on a spare branch and then delete it: rebasing overwrites the Committer information on the commits, thus losing some of the history you want to keep.)

  4. The next step is to graft the two branches together, skipping the two commits which renamed the folder. (Otherwise there will be a weird jump where everything is deleted and recreated.) This involves finding the full SHA (all 40 characters!) of the two commits and appending them to .git/info/grafts, with the new-name branch's commit first and the old-name branch's commit second.

     $ echo $NEW_NAME_SECOND_COMMIT_SHA1 $OLD_NAME_PENULTIMATE_COMMIT_SHA1 >> .git/info/grafts 

    If you've done this right, git log --graph should now show a line from the end of the new history to the start of the old history.

  5. This graft is currently temporary: it is not yet part of the history, and won't follow along with clones or pushes. To make it permanent:

     $ git filter-branch 

    This will refilter the branch without trying to make any further changes, making the graft permanent (changing all of the commits in the filter-new-name branch). You should now be able to delete the .git/info/grafts file.
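For step 1, one possible way to collect the past names is to let git report file renames and reduce them to top-level directory pairs. This is only a sketch: list_dir_renames is a hypothetical helper, it assumes the project directories sit at the top level of the repository, and it relies on git's per-file rename detection, which can miss renames when files changed heavily in the same commit.

```shell
# Reads `git log --name-status` output on stdin and prints each distinct
# top-level directory rename as "old -> new".
# Rename records look like: R<score><TAB><old path><TAB><new path>
list_dir_renames() {
    awk -F'\t' '
        $1 ~ /^R/ && $2 ~ /\// && $3 ~ /\// {
            split($2, old, "/"); split($3, new, "/")
            if (old[1] != new[1]) print old[1] " -> " new[1]
        }' | sort -u
}

# Possible usage inside the repository:
#   git log --format= --name-status -M master | list_dir_renames
```

Renames of files within the same directory are skipped, because only a change in the first path component is reported.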

At the end of all of this, you should now have on the filter-new-name branch all of the history from both names for the folder. You can then use this separate repository, or merge it into another one, or whatever you'd like to do with this history.
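The two SHAs needed for the graft in step 4 can also be looked up with git itself rather than copied by hand. The sketch below assumes both filtered branches have linear history, that the old-name branch ends with the commit that deletes everything, and that the new-name branch begins with the commit that recreates it. graft_line is a hypothetical helper name.

```shell
# Prints one graft line: "<second commit of new branch> <penultimate commit of old branch>".
# $1: branch holding the old name's history, $2: branch holding the new name's history.
graft_line() {
    old_branch=$1
    new_branch=$2
    # Oldest-first listing of the new branch; line 2 is the first commit
    # after the one that recreated the folder.
    new_second=$(git rev-list --reverse "$new_branch" | sed -n 2p)
    # Parent of the old branch's final commit, as a full SHA.
    old_penultimate=$(git rev-parse --verify "$old_branch~1") || return 1
    [ -n "$new_second" ] || return 1
    printf '%s %s\n' "$new_second" "$old_penultimate"
}

# Possible usage, matching the branch names in the steps above:
#   graft_line filter-old-name filter-new-name >> .git/info/grafts
```

Checking the result with git log --graph before running the final git filter-branch is still worthwhile, since a wrong pair would silently splice the histories at the wrong place.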
