How do I remove the old history from a git repository?

I'm afraid I couldn't find anything quite like this particular scenario.

I have a git repository with a lot of history: 500+ branches, 500+ tags, going back to mid-2007. It contains ~19,500 commits. We'd like to remove all of the history before Jan 1, 2010, to make it smaller and easier to deal with (we would keep a complete copy of the history in an archive repository).

I know the commit that I want to have become the root of the new repository. I can't, however, figure out the correct git mojo to truncate the repo to start with that commit. I'm guessing some variant of

git filter-branch

involving grafts would be necessary; it might also be necessary to treat each of the 200+ branches we want to keep separately and then patch the repo back together (something I do know how to do).

Has anyone ever done something like this? I've got git 1.7.2.3 if that matters.

Maybe it's too late to post a reply, but as this page is the first Google result, it may still be helpful.

If you want to free some space in your git repo, but do not want to rebuild all your commits (rebase or graft), and still want to be able to push/pull/merge with people who have the full repo, you may use a shallow clone (git clone with the --depth parameter).

# Clone the original repo into limitedRepo
git clone file:///path_to/originalRepo limitedRepo --depth=10

# Remove the original repo, to free up some space
rm -rf originalRepo
cd limitedRepo
git remote rm origin
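To see what --depth does before trying it on a real repository, here is a minimal sketch on a throwaway repo. The directory names (src, limited) and the five dummy commits are made up for illustration; only the git clone --depth call comes from the answer above.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Build a hypothetical source repo with 5 commits.
git init -q src
cd src
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
cd ..

# Use a file:// URL: with a plain local path git ignores --depth.
git clone -q --depth=2 "file://$work/src" limited

full_count=$(git -C src rev-list --count HEAD)
shallow_count=$(git -C limited rev-list --count HEAD)
echo "full: $full_count, shallow: $shallow_count"
```

The shallow clone only sees the last two commits, while the source keeps all five.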

You may be able to make your existing repo shallow by following these steps:

# Make HEAD~5 the shallow boundary (keeps it and the commits after it)
git rev-parse HEAD~5 > .git/shallow

# Manually remove all other branches, tags and remotes that refer to old commits

# Prune unreachable objects
git fsck --unreachable # Will show you the list of what will be deleted
git gc --prune=now     # Will actually delete your data

See also: How to remove all git local tags?

PS: Older versions of git didn't support clone/push/pull from/to shallow repos.

Just create a graft that sets the parent of your new root commit to nothing (or to an empty commit, e.g. the real root commit of your repository). E.g. echo "<NEW-ROOT-SHA1>" > .git/info/grafts

After creating the graft, it takes effect right away; you should be able to look at git log and see that the unwanted old commits have gone away:

$ echo 4a46bc886318679d8b15e05aea40b83ff6c3bd47 > .git/info/grafts
$ git log --decorate | tail --lines=11
commit cb3da2d4d8c3378919844b29e815bfd5fdc0210c
Author: Your Name <your.email@example.com>
Date:   Fri May 24 14:04:10 2013 +0200

    Another message

commit 4a46bc886318679d8b15e05aea40b83ff6c3bd47 (grafted)
Author: Your Name <your.email@example.com>
Date:   Thu May 23 22:27:48 2013 +0200

    Some message

If all looks as intended, you can just do a simple git filter-branch -- --all to make it permanent.

BEWARE: after doing the filter-branch step, all commit ids will have changed, so anybody using the old repo must never merge with anyone using the new repo.
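Newer Git versions deprecate the .git/info/grafts file in favour of git replace --graft, so the same graft-then-filter-branch idea can be sketched as below. This is a minimal sketch on a throwaway repository, not a recipe to paste into a production repo: the five dummy commits and the choice of HEAD~2 as the new root are assumptions for illustration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Hypothetical choice: keep the last 3 commits, so HEAD~2 is the new root.
new_root=$(git rev-parse HEAD~2)

# With no parents listed, --graft makes the commit appear parentless.
git replace --graft "$new_root"
grafted_count=$(git rev-list --count HEAD)

# Make the graft permanent; this rewrites the commit ids.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- --all >/dev/null 2>&1

# Drop the replacement ref and the backup refs left by filter-branch.
git replace -d "$new_root" >/dev/null
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do
    git update-ref -d "$ref"
done

permanent_count=$(git rev-list --count HEAD)
echo "grafted: $grafted_count, permanent: $permanent_count"
```

Both counts should be 3: the graft hides the old commits immediately, and filter-branch bakes that view into new commits that no longer depend on the replacement ref.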

This method is easy to understand and works fine. The argument to the script ( $1 ) is a reference (tag, hash, ...) to the commit starting from which you want to keep your history.

#!/bin/bash
git checkout --orphan temp "$1" # create a new branch without parent history
git commit -m "Truncated history" # create a first commit on this branch
git rebase --onto temp "$1" master # now rebase the part of master branch that we want to keep onto this branch
git branch -D temp # delete the temp branch

# The following 2 commands are optional - they keep your git repo in good shape.
git prune --progress # delete all the objects w/o references
git gc --aggressive # aggressively collect garbage; may take a lot of time on large repos

NOTE that old tags will still remain present, so you might need to remove them manually.
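As a rough illustration of the tag cleanup, the sketch below runs the same orphan-and-rebase steps on a throwaway repo and then uses git tag --no-merged to find tags that are not reachable from the truncated branch. The tag names (v1 to v4) and the cutoff are invented for the demo. In a real repository --no-merged HEAD would also list tags that live only on other branches, so review the list before deleting anything.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3 4; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
    git tag "v$i"
done
branch=$(git symbolic-ref --short HEAD)
cutoff=$(git rev-parse HEAD~2)   # hypothetical cutoff commit

# Same steps as the script above, applied to the current branch.
git checkout -q --orphan temp "$cutoff"
git commit -q -m "Truncated history"
git rebase -q --onto temp "$cutoff" "$branch"
git branch -q -D temp

# Tags that still point into the old history.
stale=$(git tag --no-merged HEAD)
stale_count=$(printf '%s\n' "$stale" | wc -l | tr -d ' ')
echo "stale tags: $stale"

# Delete them so the old commits can be garbage collected.
if [ -n "$stale" ]; then
    git tag -d $stale >/dev/null
fi
remaining=$(git tag | wc -l | tr -d ' ')
```

Because the rebase gives every kept commit a new id, all four tags in this demo end up stale; tags you still need would have to be recreated on the corresponding new commits.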

Remark: I know this is almost the same answer as @yoyodin's, but there are some important extra commands and information here. I tried to edit the answer, but since it is a substantial change to @yoyodin's answer, my edit was rejected, so here's the information!

Try this method, How to truncate git history:

#!/bin/bash
git checkout --orphan temp "$1"
git commit -m "Truncated history"
git rebase --onto temp "$1" master
git branch -D temp

Here $1 is the SHA-1 of the commit you want to keep, and the script will create a new branch that contains all commits between $1 and master; all the older history is dropped. Note that this simple script assumes that you do not have an existing branch called temp. Also note that this script does not clear the git data for the old history. Run git gc --prune=all && git repack -a -f -F -d after you've verified that you truly want to lose all history. You may also need rebase --preserve-merges (replaced by --rebase-merges in newer Git versions), but be warned that the git implementation of that feature is not perfect. Inspect the results manually if you use that.

As an alternative to rewriting history, consider using git replace as in this article from the Pro Git book. The example discussed involves replacing a parent commit to simulate the beginning of a tree, while still keeping the full history as a separate branch for safekeeping.

If you want to keep the upstream repository with full history, but local smaller checkouts, do a shallow clone with git clone --depth=1 [repo].

After pushing a commit, you can do

  1. git fetch --depth=1 to prune the old commits. This makes the old commits and their objects unreachable.
  2. git reflog expire --expire-unreachable=now --all to expire all old commits and their objects.
  3. git gc --aggressive --prune=all to remove the old objects.

See also How to remove local git history after a commit?

Note that you cannot push this "shallow" repository to somewhere else: "shallow update not allowed". See Remote rejected (shallow update not allowed) after changing Git remote URL. If you want to do that, you have to stick with grafting.

I needed to read several answers and some other info to understand what I was doing.

1. Ignore everything older than a certain commit

The file .git/info/grafts can define fake parents for a commit. A line with just a commit id says that the commit doesn't have a parent. If we wanted to say that we care only about the last 2000 commits, we can type:

git rev-parse HEAD~2000 > .git/info/grafts

git rev-parse gives us the commit id of the 2000th ancestor of the current commit. The above command will overwrite the grafts file if present. Check if it's there first.

2. Rewrite the Git history (optional)

If you want to make this grafted fake parent a real one, then run:

git filter-branch -- --all

It will change all commit ids. Every copy of this repository needs to be updated forcefully.

3. Clean up disk space

I didn't do step 2, because I wanted my copy to stay compatible with the upstream. I just wanted to save some disk space. In order to forget all the old commits:

git prune
git gc

Alternative: shallow copies

If you have a shallow copy of another repository and just want to save some disk space, you can update .git/shallow. But be careful that nothing is pointing at a commit from before. So you could run something like this:

git fetch --prune
git rev-parse HEAD~2000 > .git/shallow
git prune
git gc

The entry in shallow works like a graft. But be careful not to use grafts and shallow at the same time. At the very least, don't put the same entries in both, or it will fail.

If you still have some old references (tags, branches, remote heads) that point to older commits, they won't be cleaned up and you won't save more disk space.

When you rebase or push to head/master, this error may occur:

remote: GitLab: You are not allowed to access some of the refs!
To git@giturl:main/xyz.git
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'git@giturl:main/xyz.git'

To resolve this issue, remove the master branch from "Protected branches" in the GitLab dashboard.

Then you can run this command:

git push -f origin master

or

git rebase --onto temp $1 master

There are too many answers here which are not current and some don't fully explain the consequences. Here's what worked for me for trimming down the history using latest git 2.26:

First create a dummy commit. This commit will appear as the first commit in your truncated repo. You need this because this commit will hold all base files for the history you are keeping. The SHA is the ID of the parent of the first commit you want to keep (in this example, 8365366). The string 'Initial' will show up as the commit message of the first commit. If you are using Windows, type the command below at the Git Bash command prompt.

# 8365366 is id of parent commit after which you want to preserve history
echo 'Initial' | git commit-tree 8365366^{tree}

The above command will print a SHA, for example, d10f7503bc1ec9d367da15b540887730db862023.

Now just type:

# d10f750 is commit ID from previous command
git rebase --onto d10f750 8365366

This will first put all files as of commit 8365366 into the dummy commit d10f750. Then it will play back all commits after 8365366 on top of d10f750. Finally, the master branch pointer will be updated to the last commit played back.
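To try the two commands end to end without risking real data, the sketch below replays them in a throwaway repo. The five dummy commits and the choice of HEAD~2 as the cutoff are assumptions; base and new_root are just illustrative variable names standing in for 8365366 and d10f750.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Hypothetical cutoff: history up to and including HEAD~2 is dropped,
# but its files become the starting snapshot.
base=$(git rev-parse HEAD~2)

# Parentless commit that holds the tree of $base.
new_root=$(echo 'Initial' | git commit-tree "$base^{tree}")

# Replay the commits after $base on top of the new root.
git rebase -q --onto "$new_root" "$base"

count=$(git rev-list --count HEAD)
root_msg=$(git log --format=%s --max-parents=0 HEAD)
echo "commits: $count, root message: $root_msg"
```

The result is the dummy 'Initial' commit plus the two replayed commits; the old objects stay in the object store until the reflog expires and git gc prunes them.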

Now if you want to push this truncated repo, just do git push -f.

A few things to keep in mind (these apply to other methods as well as this one): Tags are not transferred. While author timestamps are preserved (the commit IDs change, as with any rebase), you will see GitHub show these commits under a lump-sum heading like Commits on XY date.

Fortunately it is possible to keep the truncated history as an "archive", and later you can join the trimmed repo back together with the archive repo. For doing this, see this guide.

For an existing repository cloned previously with --depth

git clone --depth=1 ...

Just do

git pull --depth=1 --update-shallow

https://git-scm.com/docs/git-pull

In my case I wanted to split a repo in two, keeping the history but cleaning the log of commits that only touched files filtered out of the new repo.

This was the solution:

PATHS="path_a path_b"
git filter-branch -f --prune-empty --index-filter "git read-tree --empty
git reset \$GIT_COMMIT -- $PATHS " -- --all -- $PATHS

This way I got a new repo with the full commit log history, but only for the paths I wanted to keep.

Ref: https://stackoverflow.com/a/56334887/2397613

According to the Git repo of the BFG tool, it "removes large or troublesome blobs as git-filter-branch does, but faster - and is written in Scala".

https://github.com/rtyley/bfg-repo-cleaner

  1. remove git data: rm -rf .git
  2. git init, then add and commit the current files
  3. add a git remote
  4. force push
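The four steps above can be sketched as follows. This is only a demonstration under assumptions: a local bare repository (remote.git) stands in for the real server, and the branch name main and the dummy commits are invented. Note that this approach throws away all history, not just the old part.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the real server: a local bare repository.
git init -q --bare remote.git

# A project with some history, already pushed.
git init -q project
cd project
git config user.name "Example"
git config user.email "example@example.com"
for i in 1 2 3; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
git remote add origin "$work/remote.git"
git push -q origin HEAD:refs/heads/main

# 1. Remove all git data (the working files stay).
rm -rf .git

# 2. Start a fresh repository and commit the current files.
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git add -A
git commit -q -m "Initial commit"

# 3. Add the remote again.
git remote add origin "$work/remote.git"

# 4. Force push over the old branch.
git push -q -f origin HEAD:refs/heads/main

remote_count=$(git --git-dir="$work/remote.git" rev-list --count refs/heads/main)
echo "commits on remote branch: $remote_count"
```

After the force push the remote branch holds a single commit; the old objects remain on the remote until it runs its own garbage collection.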
