Cherry pick a parent merged commit from develop branch to deployment branch

Suppose I have a branch develop that contains a merged PR with several commits, e.g. A, B, and C. I want to get all of those commits, or, if possible, just commit A (supposing it is the parent and carries all the files from the other commits, B and C), and copy them into a new branch, tester-branch.

After doing so, I raise a PR from tester-branch targeting another branch, Deployment.

I tried something like this, but it didn't work:

git checkout tester-branch
git cherry-pick -m 1 commit-A
git add .
git commit -m "message"
git push origin tester-branch

The closest I got still pulled in other commits from the develop branch besides A, B, and C, which I don't want. I strictly want those exact commits: either only A, since it's the parent, or all of A, B, and C together so that I can combine them.

So in summary, the question is: how can one get a specific commit from a merged PR into a new branch?

TL;DR

Simply cherry-pick the commit of interest, not the merge commit.
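For the situation in the question, one way to sketch this is to start tester-branch from Deployment and cherry-pick only the commits that the PR merge brought in. The throwaway repository, file names, and commit messages below are made up for illustration. The range M^1..M^2 selects the commits reachable from the merge's second parent but not from its first, which in this simple history are exactly A, B, and C; in a more tangled history the range could include more, so it is worth checking with git log first.

```shell
set -e
# Build a throwaway repository that mimics the question (all names are illustrative).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/Deployment   # name the initial branch
git config user.name "Example User"
git config user.email "user@example.com"
echo base > base.txt && git add base.txt && git commit -q -m "base"

git checkout -q -b develop
echo x > x.txt && git add x.txt && git commit -q -m "X: unrelated work on develop"

git checkout -q -b feature                    # stands in for the PR branch
for n in A B C; do
  echo "$n" > "$n.txt" && git add "$n.txt" && git commit -q -m "$n"
done

git checkout -q develop
git merge -q --no-ff -m "Merge PR into develop" feature
merge=$(git rev-parse HEAD)                   # the PR's merge commit

# Copy only A, B, and C onto a new branch that starts at Deployment.
git checkout -q -b tester-branch Deployment
git cherry-pick "$merge^1..$merge^2" >/dev/null

# List what tester-branch has that Deployment lacks, oldest first.
picked=$(git log --reverse --format=%s Deployment..tester-branch | xargs)
echo "$picked"
```

The unrelated commit X stays behind on develop, because it is reachable from the merge's first parent and so falls outside the range.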

Long

A merge commit is a commit. Hence, like any commit, it has a snapshot. The only thing that is different about a merge commit is that it has more than one parent commit. No commit stores changes; all commits store snapshots.
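A small sketch of the "snapshots, not changes" point, using a scratch repository with made-up file names: the second commit edits only one file, yet its tree still lists every file.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > a.txt && echo two > b.txt
git add a.txt b.txt && git commit -q -m "first"

echo changed > a.txt
git commit -q -am "second: touch only a.txt"

# The second commit's snapshot lists both files, not just the one we edited.
snapshot=$(git ls-tree -r --name-only HEAD | xargs)
echo "$snapshot"
```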

The cherry-pick command works, in a sort of fundamental "what we desire to have happen" way, by comparing a commit's snapshot to its parent commit's snapshot. That is, suppose we have a simple, linear chain of commits, with newer commits on the right, which we might draw like this:

... <-F <-G <-H ...

Each commit is actually addressed by some big random-looking hash ID. Like trying to get around in Tokyo, nothing is numbered sequentially; you need a map. In Git's case, the map says that if you want to move on from H, the next commit in the backwards direction is commit G. Commit H literally stores commit G's random-looking hash ID. [1] We say that H points to G. G, in turn, stores the hash ID of an earlier commit, which we'll call F for convenient drawing purposes, so G points backwards to F. F, being a commit, will point backwards yet again, and so on.
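The backwards pointer can be inspected directly. In this sketch (scratch repository, empty commits named G and H purely for illustration), the raw commit object for H contains a parent line holding G's hash ID.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "G"
g=$(git rev-parse HEAD)
git commit -q --allow-empty -m "H"

# Print H's raw commit object and keep only its "parent" header line.
parent_line=$(git cat-file -p HEAD | grep '^parent ')
echo "$parent_line"
```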

Each commit also has a full snapshot of every file. So if we are using commit H and want to see "what changed", we have Git find, from H, the hash ID of earlier commit G. Then we have Git extract both snapshots and compare them. Some files are exactly the same in both commits. [2] Others aren't, and if we want an actual git diff showing what changed, Git then has to figure that out, right now, on demand, to show it to us.

(This is why you can run git diff or git log -p or git show with different options, to show the diff in different ways. Each time you run git diff or similar, Git re-computes the changes.)

So cherry-pick needs to run a diff, from the commit's parent (singular) to the commit. This tells us what changed: which files are different, and, via the output of git diff, this gives us a recipe: add a line here, delete a line there, and pretty soon what was in the old (parent) commit is now what is in the new (child) commit.


[1] Note that there's no way to go forwards, because once you make commit G, it is set in stone. Nothing, not even Git itself, can change it. We don't know what the hash ID of future commit H will be, because the hash ID of a commit depends on every single bit of data in the commit: the source snapshot (which we don't know yet), your name and email address (which you might change), the date-and-time of exactly when you will make the next commit, and so on.

The only thing we do know about some future commit is that it will get a big, ugly, random-looking but unique hash ID. It will get that hash ID by the act of writing it out, setting it in stone at that time. So we'll be able to store, in that future commit, the existing hash IDs of existing commits, but we won't know what the future hash IDs of future commits will be, so we can't store those. And then, once we've made that future commit, it's stuck that way: we can't add hash IDs to it! So all commits, in Git, always point backwards.
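To see how sensitive the hash ID is, here is a sketch that builds commit objects by hand with git commit-tree. The identity, message, and timestamps are arbitrary illustrative values; the point is only that identical inputs give the same hash and a one-second change gives a different one.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

tree=$(git write-tree)   # snapshot of the (empty) index

# Create a commit object from the same tree and message at a given timestamp.
make_commit() {
  GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
    git commit-tree "$tree" -m "same message"
}

h1=$(make_commit "1700000000 +0000")
h2=$(make_commit "1700000000 +0000")   # identical inputs
h3=$(make_commit "1700000001 +0000")   # one second later

echo "$h1"
echo "$h2"
echo "$h3"
```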

[2] Git de-duplicates these automatically, which makes it very easy for Git to tell that the two files are identical. This in turn avoids having to do any of the extraction work, so comparing a commit to another commit is really only a matter of comparing the already-known-to-be-different files. The already-known-to-be-the-same files get eliminated almost instantly.
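The de-duplication shows up as identical blob hash IDs. In this sketch (scratch repository, made-up file names), the untouched file resolves to the same blob in both commits, while the edited file does not.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one > a.txt && echo two > b.txt
git add a.txt b.txt && git commit -q -m "first"
echo changed > a.txt
git commit -q -am "second: touch only a.txt"

# <commit>:<path> names the blob stored for that path in that commit.
b_old=$(git rev-parse HEAD~1:b.txt)
b_new=$(git rev-parse HEAD:b.txt)
a_old=$(git rev-parse HEAD~1:a.txt)
a_new=$(git rev-parse HEAD:a.txt)

# Only the file whose blob differs shows up in the comparison.
changed=$(git diff --name-only HEAD~1 HEAD)
echo "$changed"
```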


This is why cherry-picking a merge requires the -m flag

A merge commit, as we already noted, is a commit with more than one parent. We get a merge commit when we run git merge. [3] That is, we start out with, say:

          I--J   <-- branch1 (HEAD)
         /
...--G--H
         \
          K--L   <-- branch2

where we're on branch branch1 (that's the attached HEAD name), and we run:

git merge branch2

Git now finds commits J and L (our current commit and the commit we'd like to merge) and uses the backwards linkage to find the best shared starting-point commit, the merge base, which in this case is obviously [4] commit H.

Git figures out what we changed by doing a diff from H to J, and what they changed by doing a diff from H to L. That's the same idea as doing a diff from a parent commit to its child: we just leap over several commits at a time this way. Then git merge combines these two sets of changes and applies those combined changes to the snapshot in H.
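The same graph can be rebuilt in a scratch repository to check this. The commit helper below is an illustrative shortcut that adds one file per commit, named after the commit; git merge-base then reports H, and the two diffs list each side's work.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch1
git config user.name "Example User"
git config user.email "user@example.com"

# Illustrative helper: make a commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit G; commit H
h=$(git rev-parse HEAD)
git branch branch2
commit I; commit J
git checkout -q branch2
commit K; commit L
git checkout -q branch1

base=$(git merge-base branch1 branch2)
ours=$(git diff --name-only "$base" branch1 | xargs)     # what we changed
theirs=$(git diff --name-only "$base" branch2 | xargs)   # what they changed
echo "ours: $ours"
echo "theirs: $theirs"
```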

This has the effect of adding, to our snapshot in J, the changes they made to get to L. Or, from their point of view, it has the effect of adding, to their snapshot in L, the changes we made to get to J. That happens because this adding-up of changes keeps our changes but adds theirs, or, from their point of view, keeps their changes but adds ours. [5]

The final merge result is a new commit M that, instead of one single parent J , has two parents:

          I--J
         /    \₁
...--G--H      M   <-- branch1 (HEAD)
         \    /²
          K--L   <-- branch2

The first parent, or for cherry-pick, -m 1, is J. The second parent, -m 2, is L.
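The parent numbering can be confirmed with the ^1 and ^2 suffixes. This sketch rebuilds the graph in a scratch repository (the commit helper and file names are illustrative) and forces a real merge commit with --no-ff.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch1
git config user.name "Example User"
git config user.email "user@example.com"

# Illustrative helper: make a commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit G; commit H
git branch branch2
commit I; commit J
git checkout -q branch2
commit K; commit L
git checkout -q branch1

j=$(git rev-parse branch1)
l=$(git rev-parse branch2)
git merge -q --no-ff -m "M" branch2

p1=$(git rev-parse HEAD^1)   # first parent: where we were when we ran git merge
p2=$(git rev-parse HEAD^2)   # second parent: the commit we merged in
echo "$p1"
echo "$p2"
```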


[3] More precisely, we sometimes get a merge commit from git merge. Some git merge operations, such as fast-forwards, don't produce merge commits at all, but in that case, there's no problem for cherry-pick, right? 😀

[4] If it isn't obvious, you just haven't been using a version control system for very long. The non-obvious merge bases are in much more tangled graphs. See some examples in Pretty Git branch graphs to get a feel for easy vs hard cases.

[5] The algebra here is pretty straightforward except when it comes to merge conflicts, or the special (but not actually all that unusual) case of a merge where we and they both fix the same problem in the same way, where Git takes just one copy of the change.


Cherry-picking a merge means "take all their changes"

Now suppose we have this:

           I--J
          /    \₁
         H      M   <-- branch1
        / \    /²
       /   K--L   <-- branch2
      /
...--G
      \
       N--O--P   <-- branch3 (HEAD)

We're currently sitting here, on commit P via branch3, which has a snapshot and metadata like any commit. So our checked-out working tree matches commit P, assuming we haven't made any changes to it.

If we now run:

git cherry-pick -m 1 branch1

we're telling Git: Go find commit M. It's a merge, so go to its first parent, J. Diff J vs M. Figure out what changed.

So: what did change, from J to M? Well, we ran git merge, and git merge combined our work-since-H with their work-since-H.

The diff from J to M, then, is the sum of the changes they made in commits K and L on branch2. More precisely, it's the sum of those changes, minus anything that we already had in our commit J, plus or minus any other things we did if we had to resolve merge conflicts.

That's the diff from J to M, so that's what we're telling Git to add to commit P, so as to make a new commit. If we want what they fixed in commit K, but not what they did in commit L, that's the wrong thing to ask for. We should instead run:

git cherry-pick <hash-of-K>

That will compare H vs K, to see what changed, and try to add those changes to our commit P.
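Both variants can be tried side by side in a scratch repository. The sketch below rebuilds the graph with one file per commit (the commit helper and the pick-merge and pick-k branch names are made up for illustration), then compares what each cherry-pick brings onto a copy of branch3.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/branch1
git config user.name "Example User"
git config user.email "user@example.com"

# Illustrative helper: make a commit that adds one file named after the commit.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit G
git branch branch3
commit H
git branch branch2
commit I; commit J
git checkout -q branch2
commit K; k=$(git rev-parse HEAD)
commit L
git checkout -q branch1
git merge -q --no-ff -m "M" branch2
git checkout -q branch3
commit N; commit O; commit P

# What -m 1 will replay: the diff from M's first parent (J) to M.
merge_diff=$(git diff --name-only branch1^1 branch1 | xargs)

# Variant 1: cherry-pick the merge itself, relative to its first parent.
git checkout -q -b pick-merge branch3
git cherry-pick -m 1 branch1 >/dev/null
with_merge=$(git ls-tree --name-only HEAD | xargs)

# Variant 2: cherry-pick only commit K.
git checkout -q -b pick-k branch3
git cherry-pick "$k" >/dev/null
with_k=$(git ls-tree --name-only HEAD | xargs)

echo "J..M changes: $merge_diff"
echo "after -m 1:   $with_merge"
echo "after K only: $with_k"
```

With -m 1 the new commit carries the work of both K and L; picking K alone leaves L's file out.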

Cherry-pick is a merge, sort of

There's one other thing to be aware of here, and that's that a git cherry-pick operation is actually implemented by Git's merge engine.

Let's take a look at a simple cherry-pick operation like this one:

          o--o--o--P--C--o--...    <-- their-branch
         /
...--o--o
         \
          o--...--o--H   <-- our-branch (HEAD)

Technically, it doesn't even matter if our branches are related at all:

...--P--C--...    <-- their-branch

...----H   <-- our-branch (HEAD)

but generally cherry-pick only makes sense if the branches have some kind of relationship. We now run:

git cherry-pick <hash-of-C>

where we get the hash ID of commit C by running git log and finding a commit that says it fixed the bug we care about, for instance.

Git now has to diff P vs C to see what they changed. That's perfectly straightforward. Maybe they changed two or three files, or just one file, or hundreds of files: whatever they changed, that's our change-of-interest.

But: that's just a diff from P to C. It says to add some line at line 1234 of file path/to/file.py or whatever. Here, in our commit H, what if path/to/file.py has that line at line 4234, or 916, because of some big changes we have that they don't or vice versa?

Git could search around for context, and some versions of some cherry-pick-like commands in some systems have done this in the past. But that's error-prone: the context-finding can misfire pretty easily. Moreover, if we've renamed the file, or if they have, so that in our source it's under new/name.py, we'd like Git to figure that out too.

To have Git figure out what we did with that file, Git needs to compare the snapshot in P to the snapshot in H. Git will now find path/to/file.py or maybe new/name.py, whichever name it has, and by doing this diff, discover that what was line 1234 is now line 4234, or line 916, or whatever.

So, Git now knows how to apply their change to our file, even if the change has moved around, and even if the change is to a different file name. But ... we've seen this exact story before, when we ran git merge:

  • Git starts with some shared starting point, and runs two git diffs.
  • Git then keeps our changes while adding their changes.

Git can simply do the same thing here, using P as the "shared starting point", our HEAD commit H as the --ours commit, and their child commit C as the --theirs commit. So that's what Git does.
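One way to check this description is to compare a cherry-pick against a plain three-way file merge with base P, ours H, and theirs C. The sketch below uses a scratch repository and a made-up ten-line file: C fixes line 8, while H adds three lines at the top so the fix has to land three lines lower. It is a single-file illustration of the idea, not a description of how Git's merge engine is implemented internally.

```shell
set -e
work=$(mktemp -d)
mkdir "$work/repo" && cd "$work/repo"
git init -q
git symbolic-ref HEAD refs/heads/their-branch
git config user.name "Example User"
git config user.email "user@example.com"

# P: the shared starting point, a ten-line file.
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 > file.txt
git add file.txt && git commit -q -m "P"
git branch our-branch

# C (theirs): fix line 8.
sed 's/^line 8$/line 8 fixed/' file.txt > file.tmp && mv file.tmp file.txt
git commit -q -am "C: fix line 8"
c=$(git rev-parse HEAD)

# H (ours): three new lines at the top, so everything below shifts down.
git checkout -q our-branch
{ printf 'header %s\n' a b c; cat file.txt; } > file.tmp && mv file.tmp file.txt
git commit -q -am "H: add header"

# Predict the result with a three-way file merge: base P, ours H, theirs C.
git show "$c^:file.txt" > "$work/base.txt"
git show HEAD:file.txt > "$work/ours.txt"
git show "$c:file.txt" > "$work/theirs.txt"
git merge-file -p "$work/ours.txt" "$work/base.txt" "$work/theirs.txt" > "$work/expected.txt"

git cherry-pick "$c" >/dev/null
fixed_at=$(grep -n '^line 8 fixed$' file.txt | cut -d: -f1)
echo "fix landed on line $fixed_at"
```

The fix was made at line 8 in their file but ends up three lines lower in ours, and the cherry-picked file matches the three-way merge output.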

The underlying merge code in Git is split into two parts:

  • The action of merging, or what I like to call merge as a verb, is the action of combining changes. This compares some starting or base commit (such as P) to two ending-point commits (ours H and theirs C), figures out what changed, combines the changes, and applies the combined changes to the base. That keeps our changes but adds theirs. Or, in the earlier git merge example, it compares base H to ours J and theirs L, figures out what changed, combines the changes, and applies the changes to the base.

  • The commit that stores a merge, or what I like to call merge as a noun, is simply a merge commit like M, with two parents. [6] Sometimes "merge" here is an adjective ("a merge commit"); English morphs a lot. As Calvin once said, "Verbing weirds language."

The git merge command (optionally) makes merge-as-a-noun-or-adjective commits. To do that, it merges (as a verb) changes. The git cherry-pick command merges changes, then makes ordinary (non-merge) commits.


[6] Git allows more than two parents, in what Git calls an octopus merge. These don't do anything you can't already do with ordinary two-parent merges, though.
