Git: Insert existing commits from one branch into another branch's history

I want to insert several commits from one branch into the history of another, so that:

A - B - F - G - H - I - J  (branch working)
    \
      C - D - E            (old branch)

becomes...

A - B - C - D - E - F - G - H - I - J (continue with branch working)

Anything I've tried so far has resulted in multiple conflicts preventing me from continuing, but I don't care about them as long as the state of the subsequent commits is the same. Is the problem that commit F "doesn't know" how to become the child of commit E?

You literally can't do that, because that would involve changing some existing commit. In this case, existing commit F stores B's hash ID as its parent. No existing commit can ever be changed: commits are entirely read-only (in fact, all of Git's internal objects are read-only).

I assume you want to pretend that F stores E's hash ID as its parent instead, without changing the snapshot associated with any of the existing commits. You can do this, and you can do a number of similar things. Before we look at those, though, let's look at git rebase.

The problem with using git rebase here is that rebase works by copying commits (which is all well and good so far), with each copy made as if by git cherry-pick (see footnote 1), which is where things start to go wrong. To cherry-pick one single commit, Git will:

  • Compare the commit's snapshot with its parent's snapshot, e.g., J vs I, or F vs B. You can do this yourself by running git diff <hash1> <hash2>, or more simply git show <hash2>.

  • Merge that diff with a second diff, the one from the same parent (the <hash1> above) to the current or HEAD commit, and use the merge result to make a new commit. In other words, the parent acts as the merge base of a three-way merge.

  • Copy the original commit's message to the new commit.

The new commit, as always, goes onto the current branch, making the branch one commit longer and changing the HEAD commit to name the new commit.

The merge step, plus the fact that Git is converting snapshots (commits) to changesets (diffs), means that the final cherry-picked result can be a very different snapshot from the original. How different depends on how far <hash1> is from whatever is HEAD when you start the process.
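The steps above can be tried in a throwaway repository. This is only a sketch: the scratch repository, the branch names working and old, and the one-file-per-commit layout are assumptions made for illustration, and because every commit touches its own file the merge step never conflicts, unlike the situation in the question.

```shell
set -e
# Hypothetical scratch repository, used only for illustration.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/working
git config user.name "Example" && git config user.email "example@example.com"
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git branch old               # "old" forks off at B
commit F                     # F's parent is B
git checkout -q old
commit C                     # HEAD is now C, on branch "old"

# Step 1: the changeset Git derives from F (F's snapshot vs its parent's).
git diff --stat working~1 working
# Steps 2 and 3: merge that changeset into HEAD and reuse F's message.
git cherry-pick working >/dev/null

git log --format=%s -1       # message of the new copy
git ls-tree --name-only HEAD # snapshot of the copy
```

The copy carries F's message, but its snapshot also contains C.txt, so it is not the snapshot stored in the original F.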

The git rebase command essentially automates the process of cherry-picking a whole series of commits at once. At the end of all the copying, git rebase forces the branch label to point to the final copied commit. This would allow you to copy F to a new F' whose parent is E, then copy G to a new G' whose parent is F', and so on:

A--B--F--G--H--I--J   <-- (original)
    \
     C--D--E   <-- (target of rebase)
            \
             F'-G'-H'-I'-J'   <-- branch
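A shortened version of this graph can be reproduced as follows. Again this is a sketch under assumptions: the scratch repository and the branch names working and old are hypothetical, and the commits are built so that no cherry-pick conflicts, which is exactly what the real history in the question does not guarantee.

```shell
set -e
# Hypothetical scratch repository, used only for illustration.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/working
git config user.name "Example" && git config user.email "example@example.com"
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git branch old
commit F; commit G           # working: A-B-F-G
git checkout -q old
commit C; commit D           # old:     A-B-C-D
git checkout -q working

original_tip=$(git rev-parse working)
# Copy F and G on top of the tip of "old", then move the "working" label.
git rebase -q old
git log --format=%s --reverse working
```

After the rebase, working names the last copy; the original F and G still exist in the repository but no branch points to them any more.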

[1] Sometimes git rebase literally runs git cherry-pick, and sometimes it uses a method that should usually produce the same result but, in some odd corner cases, doesn't.


Now, suppose that instead of all this, we make a Git object that Git can "look aside" to, whenever it's about to use F. This replacement-for-F goes into the graph like this:

A--B--F--G--H--I--J   <-- branch
    \
     C--D--E   <-- some_name
            \
             Frepl   <-- refs/replace/<hash>

We copy the commit message for F to that for Frepl, and copy most of the other fields as well, including the internal Git hash ID for the tree object for the commit. But instead of pointing to commit B, commit Frepl points to commit E.

Now we just need Git to "turn its eyes" from F to Frepl every time it's about to work with commit F for any reason. And there is a way to do that: we give Frepl a special name, refs/replace/big-ugly-hash-id, where big-ugly-hash-id is the actual hash ID of commit F.

The Git command that makes the replacement is git replace. The looking-aside is automatic: Git always does it, unless we run git --no-replace-objects. This is all done without changing any existing Git object, so it's just adding to the repository.
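One way to build such a replacement is git replace --graft, which writes a commit like the original but with the parents you list, and creates the refs/replace/ name for it. The sketch below uses a hypothetical scratch repository with a shortened history (branch names and file layout are assumptions), then views the branch with and without the lookaside.

```shell
set -e
# Hypothetical scratch repository, used only for illustration.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/working
git config user.name "Example" && git config user.email "example@example.com"
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git branch old
commit F; commit G           # working: A-B-F-G
git checkout -q old
commit C                     # old:     A-B-C
git checkout -q working

f=$(git rev-parse working~1) # hash ID of F
# Create Frepl (same tree and message as F, parent = tip of "old")
# and the name refs/replace/<hash ID of F> that points to it.
git replace --graft "$f" old

git log --format=%s --reverse working                      # lookaside on
git --no-replace-objects log --format=%s --reverse working # lookaside off
```

With the lookaside on, the log walks from F back through C; with it off, Git sees the untouched original history, and F's tree is the same in both views.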

The biggest drawback to a replacement object is that git clone by default does not copy the replacement objects (it does not fetch them, nor their names). This means that the clone does not have the replacement in view, and never looks aside to it. You can explicitly add the replacements to your fetch refspecs to get them, but it's a bit of a pain. The git push operations don't transfer them by default either.
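The effect on a clone, and the explicit refspec that brings the replacements over, can be seen in a small experiment. Everything here is an assumed setup: a hypothetical scratch repository with a grafted F, cloned through a local path, with origin as the default remote name.

```shell
set -e
# Hypothetical scratch repository with a replacement for F.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/working
git config user.name "Example" && git config user.email "example@example.com"
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }
commit A; commit B
git branch old
commit F; commit G
git checkout -q old; commit C
git checkout -q working
git replace --graft "$(git rev-parse working~1)" old

# A plain clone does not receive anything under refs/replace/.
clone=$(mktemp -d)
git clone -q "$repo" "$clone/copy"
cd "$clone/copy"
git log --format=%s --reverse   # original history, no lookaside

# Fetch the replacement refs explicitly.
git fetch -q origin 'refs/replace/*:refs/replace/*'
git log --format=%s --reverse   # now goes through the replacement
```

To make this permanent in a clone, the same refspec can be added to remote.origin.fetch with git config --add; pushing replacements likewise needs an explicit refspec such as git push origin 'refs/replace/*'.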

The big advantage is that they do not require everyone to stop using the original commits in favor of new-and-improved commits. If you can get everyone to switch over, though, you can use git rebase to copy many commits and move the branch label. Or, you can use git replace initially to make the replacement object, then run git filter-branch with no filters, but telling it to filter the branch(es) on which the replacement(s) occur.

What git filter-branch does is copy every commit (on the branch or branches it's told to filter). At least logically (there's a lot of optimization), it extracts each commit, working from oldest to newest, to a temporary directory, applies every filter in a fixed sequence, then makes a new commit from whatever the filters left behind. If the new commit is bit-for-bit identical to the original, the two commits are actually just one commit; otherwise the copy is a new and different commit. The default parent(s) for each new copy are the commits made in the earlier copy of the parent(s) (though there is a --parent-filter to let you change that!). It does all of this with the replacement lookaside gimmick in effect (see footnote 2), so when Git goes to copy F, it actually copies Frepl instead, and then follows Frepl back to E. If we have it filter the name branch, the result is that Git copies A, then B, then C, then D, then E, then Frepl, then G, then H, then I, and then J, giving:

A--B--F--G--H--I--J   <-- refs/original/refs/heads/branch
    \
     C--D--E   <-- some_name
            \
             Frepl   <-- refs/replace/<hash>
                \
                 G'-H'-I'-J'   <-- branch

Note that branch points to commit J', whose parent is I' but whose tree (snapshot) is the same as that of J. Commit I' points back to H', which points back to G', which points back to Frepl (the copy we made back when we ran git replace: it stayed unchanged during the filtering). This points back to E (which is its own unchanged copy), which points back to D, and so on back to A.

The final effect, then, if we toss out all the refs/original/ names like refs/original/refs/heads/branch, is to "cement in place" any replacement commits. We can then delete the refs/replace/ name for Frepl and it looks as if we only ever had commits A-B-C-D-E-Frepl-G'-H'-I'-J' in the repository. The hash IDs of many commits have changed, so this repository is no longer compatible with the original repository; but if we start from the name branch, we see only the shiny new copied commits in all the right places.
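Put together, the whole sequence might look like the sketch below. It assumes a hypothetical scratch repository with a shortened history, a branch called working, and a Git installation that still ships git filter-branch; FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice that recent Git versions print.

```shell
set -e
# Hypothetical scratch repository with a replacement for F.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/working
git config user.name "Example" && git config user.email "example@example.com"
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }
commit A; commit B
git branch old
commit F; commit G
git checkout -q old; commit C
git checkout -q working
f=$(git rev-parse working~1)
git replace --graft "$f" old

# Copy the branch with no filters; the replacement is honored while copying.
export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch -- working >/dev/null

# Drop the backup name and the replacement name.
git update-ref -d refs/original/refs/heads/working
git replace -d "$f" >/dev/null

# Even with the lookaside disabled, the new history is the one we wanted.
git --no-replace-objects log --format=%s --reverse working
```

Only the commits after the graft point get new hash IDs; anything filter-branch copies bit-for-bit keeps its original ID.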


[2] Of course, if you run git --no-replace-objects filter-branch, that disables the replacement lookaside. There's probably never any reason to do this.
