
How can I delete all commits before a given date in Git history?

Given a repository, I want to delete all commits that were before a particular commit, or a date in history.

I have around 10000 commits in my repository, and I want to keep only the last 1000 or so and delete the rest. Basically, what I want is to move the first commit forward to X.

At first I thought I could just rebase and squash all of those commits into one, but that causes a lot of merge conflicts during the rebase. If there was a way to squash commits such that the version after the squash is the last commit, that'd work too.

Warning: the following is dangerous, as it rewrites history. Always make sure you have a backup of your repo before doing any kind of major history rewriting like this.

Replace the hash in the following with the hash of the parent of the commit you want to have as your new first commit.

git filter-branch --parent-filter '
    read parent
    if [ "$parent" = "-p 5bdd44e5919cb0a95a9924817529cd7c980f88b5" ]
    then
        echo
    else
        echo "$parent"
    fi'

This rewrites the parents of each commit. Most commits keep their parents unchanged, but the commit whose parent matches the given hash gets an empty parent list, so it becomes a commit with no parent. This will detach all of your old history.

Note that if what you want to be your first commit is a merge commit, you'll need to match against something like -p parent1 -p parent2 -p parent3 for each of the parents of the merge commit, in the correct order.
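As a rough illustration of the merge case, here is a sketch that builds the string to compare against from the parents of the commit you want as your new first commit. It runs in a throwaway repository; the file names, the side branch, and the variable names (match, expected) are made up for the demo.

```shell
set -eu
# Throwaway identity so commits work without any git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# Build a tiny history that ends in a two-parent merge commit.
echo base > f; git add f; git commit -q -m base
main_branch=$(git symbolic-ref --short HEAD)
git checkout -q -b side
echo side > g; git add g; git commit -q -m side
git checkout -q "$main_branch"
echo main > h; git add h; git commit -q -m main
git merge -q --no-ff -m merge side

# %P prints the parent hashes in order; prefix each one with "-p".
match=""
for p in $(git log -1 --format=%P HEAD); do
    match="${match:+$match }-p $p"
done
expected="-p $(git rev-parse HEAD^1) -p $(git rev-parse HEAD^2)"
echo "$match"
```

The printed string is what the comparison inside the --parent-filter script would use in place of the single "-p <hash>".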

If you want to apply this to all branches and tags instead of just the current branch, pass in -- --all at the end of the command (after the script); the -- separates the filter-branch options from the revision options.
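To see the whole thing in action without touching a real repository, here is a minimal sketch that creates five commits in a temporary repo and cuts the history so that the third commit becomes the root. The variable names (new_root, CUT_PARENT, count) are only for the demo, and it sets FILTER_BRANCH_SQUELCH_WARNING to skip the warning pause that newer Git versions add to filter-branch.

```shell
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
# Throwaway identity so commits work without any git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

new_root=$(git rev-parse HEAD~2)          # "commit 3" should become the first commit
CUT_PARENT=$(git rev-parse "$new_root^")  # its parent is the hash to match against
export CUT_PARENT                         # exported so the filter script can read it

git filter-branch --parent-filter '
    read parent
    if [ "$parent" = "-p $CUT_PARENT" ]
    then
        echo
    else
        echo "$parent"
    fi' -- --all >/dev/null 2>&1

count=$(git rev-list --count HEAD)
echo "$count"
```

After the rewrite only commits 3, 4 and 5 are reachable from the branch; the old chain is still referenced from refs/original until you delete it.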

After you have done this and checked that it worked properly, you can delete the backup ref of the original branch (filter-branch saves it under refs/original) and run a gc to clean out the now unreferenced commits:

git update-ref -d refs/original/refs/heads/master

Note that since git tends to try to preserve data, in order to actually free up the space you will also have to remove the commits from your reflog, and then run the gc to clean it up.

git reflog expire --expire-unreachable=all --all
git gc --prune=all
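A small sketch of why both steps are needed: in a temporary repo, a commit is made unreachable with commit --amend (a stand-in for the rewritten history here, not the filter-branch case itself), and it only disappears once the reflog entries are expired and gc has run. The variable names (old, before, after) are made up for the demo.

```shell
set -eu
# Throwaway identity so commits work without any git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo a > f
git add f
git commit -q -m first
old=$(git rev-parse HEAD)

# Rewriting the commit leaves the old one unreachable from any branch.
git commit -q --amend -m "first, rewritten"

# The old object is still in the repository at this point.
if git cat-file -e "$old" 2>/dev/null; then before=present; else before=gone; fi

git reflog expire --expire-unreachable=all --all
git gc -q --prune=all

if git cat-file -e "$old" 2>/dev/null; then after=present; else after=gone; fi
echo "$before -> $after"
```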

If you are not doing this to save space or to eradicate the old commits, you can keep the old history around in a branch, for example with git branch old-master refs/original/refs/heads/master. You can even "virtually reattach" it using git replace. You would then have two unconnected histories, so a push to a remote repo sends only the truncated history, but browsing history in your local repo shows the full history.
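One possible way to do the "virtual reattach" is git replace --graft, sketched below on two unconnected histories in a temporary repo. The branch names old-master and truncated and the variable names are assumptions for the demo, and the orphan branch merely stands in for the result of the filter-branch run.

```shell
set -eu
# Throwaway identity so commits work without any git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

# "Old" history: two commits, kept around in a branch.
echo 1 > f; git add f; git commit -q -m "old 1"
echo 2 > f; git commit -q -am "old 2"
git branch old-master

# "Truncated" history: an unrelated root commit plus one more commit.
git checkout -q --orphan truncated
echo 3 > f; git add f; git commit -q -m "new root"
echo 4 > f; git commit -q -am "new 2"

# Locally pretend that the new root has the tip of the old history as its parent.
new_root=$(git rev-list --max-parents=0 truncated)
git replace --graft "$new_root" old-master

with_replace=$(git rev-list --count truncated)
without_replace=$(git --no-replace-objects rev-list --count truncated)
echo "$with_replace $without_replace"
```

With the replacement in effect the log walks through both histories; with --no-replace-objects only the truncated part is visible.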

The simplest approach for me is to use git replace (edit: successfully tested!).

First, squash all the commits you want into one (we will call <LastSha> the sha of the last commit you want to squash and <RootSha> the sha of the very first commit, i.e. your root commit):

git checkout -b big_squash <LastSha>
git reset --soft <RootSha>
git commit --amend -m "My new root"

Now your branch big_squash should point to a new root (called <NewRootSha> here). We are only interested in the sha1; the branch can be deleted at the end, once the operation has completed successfully.
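The squash step can be checked in a scratch repo. This is only a sketch: the five-commit history and the shell variables standing in for <RootSha>, <LastSha> and <NewRootSha> are invented for the demo.

```shell
set -eu
# Throwaway identity so commits work without any git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

root_sha=$(git rev-list --max-parents=0 HEAD)  # stands in for <RootSha>
last_sha=$(git rev-parse HEAD~2)               # stands in for <LastSha> ("commit 3")

git checkout -q -b big_squash "$last_sha"
git reset -q --soft "$root_sha"                # HEAD moves to the root, index keeps <LastSha>'s files
git commit -q --amend -m "My new root"
new_root_sha=$(git rev-parse HEAD)             # stands in for <NewRootSha>

new_tree=$(git rev-parse "$new_root_sha^{tree}")
last_tree=$(git rev-parse "$last_sha^{tree}")
new_parents=$(git log -1 --format=%P "$new_root_sha")
echo "$new_root_sha"
```

The new root has no parent and carries exactly the files of <LastSha>, which is what makes it usable as a stand-in for that commit.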

Then you have 2 possibilities:

  • Do a git rebase --onto of the later commits if that is easily done (that's the preferred solution of the Git book, but after a successful test of the other solution, it's not mine ;) )
  • Use git replace to hide the old history (the history is still in the repository, but we will make the change permanent with a git filter-branch)

To replace the last commit you want to squash with the newly created commit:

git replace <LastSha> <NewRootSha>

Now, you could do a git filter-branch after the git replace to make it permanent!

After your replace, do:

git filter-branch -- master <put here the names of all your other branches>

If the result suits you, then go delete the folder .git/refs/original (which contains all the refs saved before the git filter-branch) and the folder .git/refs/replace (which contains the replacement that you don't need anymore).
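Put together, the whole sequence might look like the sketch below, run against a scratch repo with five commits. It assumes the replacement is registered for <LastSha> (the last squashed commit), and it removes the saved refs with git update-ref -d instead of deleting the folders by hand, which is a substitution on my part; the variable names are invented for the demo.

```shell
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
# Throwaway identity so commits work without any git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done
branch=$(git symbolic-ref --short HEAD)
root_sha=$(git rev-list --max-parents=0 HEAD)
last_sha=$(git rev-parse HEAD~2)

# Squash everything up to <LastSha> into a new root.
git checkout -q -b big_squash "$last_sha"
git reset -q --soft "$root_sha"
git commit -q --amend -m "My new root"
new_root_sha=$(git rev-parse HEAD)
git checkout -q "$branch"

# Hide the old history, then make the replacement permanent.
git replace "$last_sha" "$new_root_sha"
git filter-branch -- "$branch" >/dev/null 2>&1

# Clean up the saved refs, the replacement, and the helper branch.
git for-each-ref --format='%(refname)' refs/original refs/replace |
while read -r ref; do
    git update-ref -d "$ref"
done
git branch -q -D big_squash

count=$(git --no-replace-objects rev-list --count "$branch")
root_parents=$(git log -1 --format=%P "$(git rev-list --max-parents=0 "$branch")")
echo "$count"
```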

This solution has the advantage of being simple and reversible (except for the last step, once you've deleted the folders ;) )

That's done!

You could find documentation here :

You can't quite get what you want, because you can't remove anything from a repository, you can only add new things to it.

To restate, but with a commit graph drawing, what you have now is (simplified):

<jumble of commits> - K - L - M - etc ...  <-- master
                        \      / (merges)  <-- etc
                        (branches)

and what you want (similarly simplified) is:

K - L - M - etc ...  <-- master
 \      / (merges)  <-- etc
 (branches)

so that K is now the root commit.

You can't get that, but you can get a new root commit that is almost exactly the same as K, with two big differences: a different SHA-1, and no parent commit ID(s). The commit would have the same tree and all the same files as commit K.

Having copied K to K', you can then copy L to L' and so on, so that what you get is a new commit graph that has the same shape, the same files and so on, just with all-new SHA-1 IDs.

The Git command that does this is filter-branch.

There are at least two ways to achieve this with filter-branch. One is to have a commit filter that:

  • skips all commits until commit K appears, then
  • copies all commits (including K itself)

(and then add the usual --tag-name-filter cat and so on). This one is slightly painful, as the commit filter is not eval-ed, so you have to "remember" the skip/keep state externally (e.g., in a file).

Another method is to use --parent-filter, as already described by Brian Campbell.

The difference between these is that the --parent-filter method is easier but copies all the "pre-K" commits too, so that you wind up with two independent graphs in your copy. You might want this, or not. If, after you clean out the refs/original name-space, there are no references to the "pre-K'" commits, they will be garbage-collected as usual, so the difference goes away.

You could use a shallow clone via git clone --depth 1000. A shallow clone can still be committed to and pushed from; see https://github.com/git/git/commit/82fba2b9d39163a0c9b7a3a2f35964cbc039e1a

You can even keep the old tree around in case you still need it and it's fully compatible, no need to change history.
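For a quick feel of what a shallow clone looks like, here is a sketch that clones a five-commit scratch repo with --depth 2. The paths and variable names are made up, and it uses a file:// URL because Git ignores --depth when cloning from a plain local path.

```shell
set -eu
# Throwaway identity so commits work without any git config (assumption for the demo).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

src=$(mktemp -d)
dst=$(mktemp -d)
cd "$src"
git init -q
for i in 1 2 3 4 5; do
    echo "$i" > file.txt
    git add file.txt
    git commit -q -m "commit $i"
done

# Only the two most recent commits are fetched; the source repo stays untouched.
git clone -q --depth 2 "file://$src" "$dst/shallow"

depth_count=$(git -C "$dst/shallow" rev-list --count HEAD)
full_count=$(git -C "$src" rev-list --count HEAD)
echo "$depth_count of $full_count"
```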
