
How do I fix a git subtree after the upstream project force pushed onto master?

I've been experimenting with using git subtree and have run into the following situation.

I used git subtree to add an external project to my repo. I intentionally kept all of the upstream project's history, because I want to be able to refer to it and also contribute back to the upstream project later.

As it turns out, another contributor to the upstream project accidentally pushed a large file into the master branch. To fix this, the upstream project rewrote history and force pushed onto master. When creating my "monorepo", I included this commit and I would also like to remove it.

How can I update my repository to reflect the new history of the subtree?

My first attempt was to use filter-branch to completely remove the subtree and all history.

git filter-branch --index-filter 'git rm -rf --cached --ignore-unmatch upstream-project-dir' --prune-empty HEAD

Once the old version of the subtree was removed, I could re-add the subtree using the new upstream master. However, this didn't work because for some reason the commit history still shows up in the git log output.

Update

I've written up the steps to create a minimal reproducible example.

  1. First create an empty git repo.

     git init test-monorepo
     cd ./test-monorepo
  2. Create an initial commit.

     echo hello world > README
     git add README
     git commit -m 'initial commit'
  3. Now add a subtree for an external project.

     git remote add thirdparty git@github.com:teivah/algodeck.git
     git fetch thirdparty
     git subtree add --prefix algodeck thirdparty master
  4. Make some commits on the monorepo.

     echo dont panic >> algodeck/README.md
     git commit -a -m 'test commit'
  5. Now attempt to use git filter-branch to remove the subtree.

     git filter-branch --index-filter 'git rm -rf --cached --ignore-unmatch algodeck' --prune-empty HEAD
  6. Examine the git log output. I expect to see only my initial commit.

     git log
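A possible explanation for what the log shows, sketched under assumptions: without `--squash`, `git subtree add` records a merge commit whose second parent is the upstream history, and those upstream commits keep their files at the repository root rather than under `algodeck/`. The index filter therefore leaves them unchanged, and `--prune-empty` does not drop merge commits. The script below reproduces this offline; the local `upstream` repository and all identities are hypothetical stand-ins for the real project.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's startup warning and delay
work=$(mktemp -d)
cd "$work"

# Hypothetical local stand-in for the upstream project (no network needed).
git init -q upstream
cd upstream
git config user.email "up@example.com"
git config user.name "Upstream"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo upstream > README.md
git add README.md
git commit -q -m 'upstream commit'
upstream_commit=$(git rev-parse HEAD)
cd ..

# The monorepo from the steps above, pointed at the local stand-in.
git init -q test-monorepo
cd test-monorepo
git config user.email "me@example.com"
git config user.name "Me"
echo hello world > README
git add README
git commit -q -m 'initial commit'
git remote add thirdparty ../upstream
git fetch -q thirdparty
git subtree add --prefix algodeck thirdparty master >/dev/null 2>&1

# The commit created by subtree add is a merge with two parents.
echo "parents of subtree commit: $(git cat-file -p HEAD | grep -c '^parent ')"

echo dont panic >> algodeck/README.md
git commit -q -a -m 'test commit'

git filter-branch --index-filter \
    'git rm -rf --cached --ignore-unmatch algodeck' \
    --prune-empty HEAD >/dev/null 2>&1

# The upstream commit is still an ancestor of HEAD after the rewrite.
if git merge-base --is-ancestor "$upstream_commit" HEAD; then
    echo "upstream history is still reachable from HEAD"
fi
git log --oneline --graph
```

In this sketch, 'test commit' is pruned because it becomes empty, but the merge commit and everything behind its second parent survive, which matches the unexpected `git log` output.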
  1. In your repo, clean up the commit history for this remote by fetching it again:

     git fetch upstream
  2. If one of your own commits includes the large file, rewrite your history so that this large file is no longer referenced:

     # using one or more of the following commands:
     git rebase --interactive
     git filter-branch...

With these two steps, the big file will not be referenced anymore by any commit in your repo.
It will also be deleted from your hard drive at some point, once git runs its garbage collector and the expiration delay for dangling blobs has passed.


If you have an urgent need to delete this big file ASAP from your hard drive:

Manually run

git gc --prune=now
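One caveat worth checking: `git gc --prune=now` only deletes unreachable objects, and reflog entries count as references. The sketch below uses a throwaway repository with hypothetical names (`mistake`, `big.bin`) to show the blob surviving a first `gc` and disappearing once the reflog has been expired. It is an illustration under those assumptions, not a recipe tailored to your repo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "you@example.com"
git config user.name "Example"

echo hello > README
git add README
git commit -q -m 'initial commit'

# Simulate the accidental large file on a side branch, then delete the branch.
git checkout -q -b mistake
head -c 1048576 /dev/zero > big.bin
git add big.bin
git commit -q -m 'add large file by mistake'
blob=$(git rev-parse HEAD:big.bin)
git checkout -q -
git branch -q -D mistake

# No branch references the blob any more, but the HEAD reflog still does.
git gc -q --prune=now
if git cat-file -e "$blob" 2>/dev/null; then
    echo "after first gc: blob still present (kept alive by the reflog)"
fi

# Expire the reflog, then prune again.
git reflog expire --expire=now --all
git gc -q --prune=now
if git cat-file -e "$blob" 2>/dev/null; then
    echo "after second gc: blob still present"
else
    echo "after second gc: blob removed"
fi
```

If you rewrote history with `git filter-branch`, the backup refs it leaves under `refs/original/` also keep the old objects reachable until they are deleted.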

You already have the bad commit in your history, and you need to get rid of it before continuing.

Let's assume the last commit on your master has diverged from upstream and you haven't done anything else since (I can't see your branches, so I have to assume something to start with).

You can check out the previous commit and move your branch pointer one step back (or X steps back), which is harmless in any case, and then pull again.

e.g.

git checkout master~1
git branch master -f
git checkout master
git pull
  1. git checkout master~1 checks out master's parent commit. Git warns that we are in a detached HEAD state.
  2. git branch master -f forces master to point at the current checkout, i.e. it rewinds the master branch to its previous commit (or X commits back). From here it doesn't matter whether upstream force pushed or not: we can resume normally, or repeat the step above if needed. All we do is pull master again, without losing anything from upstream (which could be read-only for us, since we won't be pushing anything).
  3. git checkout master puts us on the "rewound" master branch: the same commit we were already on, but now on the branch instead of detached.
  4. git pull pulls master again (with or without --prune). If upstream diverged, we get back on track from here. If not, we get the same commits we had before. If we end up with the same commits and did not expect to, we may need to go back to the first step and rewind further, e.g. git checkout master~5 or however many commits are needed.
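The four steps can be tried safely in a sandbox before touching a real repo. The sketch below is a minimal simulation with hypothetical local repositories (`upstream`, `local`) and file names; it rewrites the upstream master, then rewinds and pulls in the clone. It uses the equivalent spelling `git branch -f master`.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical upstream: one good commit, then a bad one with a "large" file.
git init -q upstream
cd upstream
git config user.email "up@example.com"
git config user.name "Upstream"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo one > file.txt
git add file.txt
git commit -q -m 'good commit'
echo pretend-this-is-huge > large.bin
git add large.bin
git commit -q -m 'bad commit with large file'
cd ..

# Our clone picks up the bad commit.
git clone -q upstream local
bad=$(git -C local rev-parse HEAD)

# Upstream rewrites history: drops the bad commit and adds a replacement.
cd upstream
git reset -q --hard HEAD~1
echo two > file2.txt
git add file2.txt
git commit -q -m 'replacement commit'
cd ../local

# The steps from the answer.
git checkout -q master~1   # detached HEAD on master's parent
git branch -f master       # rewind master to the current commit
git checkout -q master     # back on the (rewound) branch
git pull -q                # fast-forwards to the rewritten upstream master

if git merge-base --is-ancestor "$bad" HEAD; then
    echo "bad commit is still in master's history"
else
    echo "bad commit is no longer in master's history"
fi
```

This only works as a fast-forward when the commit you rewind to still exists in the rewritten upstream history; otherwise rewind further, as described in step 4.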
