
How do I import a commit into a branch

I have a Git commit that was made on some branch, and master does not have it. How do I import that commit into master?

I would prefer to use a merge, as the change needs to go through a pull request.

Git does not have pull requests. 1 GitHub has pull requests, as do other web-hosting providers that let you use Git as a service. These services build a fancier, more user-friendly interface on top of several underlying Git mechanisms.

The one you have asked about specifically is git merge . It is possible to use this command—and/or the fancy Web interfaces that GitHub and others provide—without understanding what's going on. I don't think this is a good idea, myself. However, to understand what really is going on, there are a few basic concepts that one needs to "get" first.

One of these has to do with the meaning of the term branch itself: see What exactly do we mean by "branch"? When someone says "a branch", sometimes they mean a branch name —or one of Git's various other names that aren't quite branch names—and sometimes they mean a vaguely-defined series of commits .

Again, none of this is required to use the pull request interface on GitHub. To use that, you just click some web buttons. But it's a good idea—I think, at least—to know what's going to happen with these buttons, and for that, you do need to understand how Git works. So let's dive into the details.


1 By this, I mean there is no git pull-request command. There is a git request-pull command, and what it does is generate an email message. That's the extent to which Git supports pull requests: it has a command to generate an email message, asking someone else to do something.


Commits

A Git commit is a snapshot of all of your files as of the state they had when you made the snapshot, plus a bit of metadata describing the snapshot. (Technically, the commit represents the snapshot indirectly, as the snapshot is stored separately as a tree object, but in most cases you don't need to know this: there's a one-way link from commit to snapshot, so that given the commit Git can always find the snapshot. The linkage means that multiple different commits can represent the same snapshot without taking the space twice, though, which is useful for several purposes.)

Each commit is identified by a hash ID like b5101f929789889c2e536d915698f58d5c5c6b7a . These things are big and ugly and impossible for humans to deal with, but they are how Git finds the commits, so they are crucial for Git's operation. The hash ID for any one particular commit is always unique: that hash ID is that commit and no other, ever; every other commit has a different hash ID. Moreover, all Gits everywhere in the universe agree on these hash ID computations. Given two different Git repositories, if they both have some commit ID H —that is, they both have a commit object whose hash is H —the contents of that object must necessarily be the same. 2

The metadata in the commit includes your name (or the name of the person who made the commit) and email address, and a time-stamp of when the commit was made. It includes your log message, telling everyone why you made this commit. But it also includes the hash ID of the commit that comes immediately before this one. We call that the commit's parent . The result is a backwards-looking chain. If we have a tiny repository with just three commits, we can draw it like this:

A <-B <-C

Here, C is the last commit. It has some unique, big ugly hash ID. It also stores the unique hash ID of commit B , so once we find C , we can use it to find B . Meanwhile B has the hash ID of A , so we can use B to find A . Since A is the very first commit, it has no parent at all—technically it's a root commit—and that lets us stop going backwards.
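The chain can be inspected directly. This is a minimal sketch, assuming a throwaway repository in a temporary directory and a placeholder "demo" identity; the commit messages A, B and C simply mirror the drawing above.

```shell
set -e
# Throwaway repository; the identity and messages are placeholders.
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Three empty commits are enough to build the A <-B <-C chain.
for msg in A B C; do
    git commit -q --allow-empty -m "$msg"
done

hash_c=$(git rev-parse HEAD)
hash_b=$(git rev-parse HEAD~1)

# The raw commit object: tree, parent, author, committer, then the message.
git cat-file -p "$hash_c"

# C's "parent" line holds B's hash ID.
parent_of_c=$(git cat-file -p "$hash_c" | sed -n 's/^parent //p')

# Following parents backwards visits C, then B, then A.
subjects=$(git log --format=%s)

# A is a root commit, so it has no "parent" line at all.
root_parent_lines=$(git cat-file -p HEAD~2 | grep -c '^parent ' || true)
```

The git log command does nothing more than follow these parent links, one hop at a time, until it reaches a commit with no parent.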

The other thing to know about commits—and all Git objects that have hash IDs—is that you cannot change any of them, ever. The reason is that the hash ID is a cryptographic checksum of the contents of the object. If you were to take a commit object and change even one bit—such as fixing the spelling of a word in your log message—you'd end up with a new and different commit , with a different hash ID. So, once a commit is made, it's forever: that hash ID is taken now. 3

What this means for us is that we don't need to draw the arrows coming out of a commit as arrows. Once the commit exists, it's permanent, and its linkages to its parents are permanent too. We just need to remember that they only go one way: backwards. New links can appear to this commit, but they can't appear going from this commit to anywhere new.
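To see that "changing" a commit really produces a new one, here is a small sketch, assuming a throwaway repository and a placeholder identity. The file name and messages are made up for illustration. It fixes a typo in a log message with git commit --amend and compares hash IDs.

```shell
set -e
cd "$(mktemp -d)"
git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

printf 'hello\n' > file.txt
git add file.txt
git commit -q -m "Fix teh bug"
old=$(git rev-parse HEAD)
old_tree=$(git rev-parse 'HEAD^{tree}')

# "Fixing" the message does not edit the old commit; it makes a new one.
git commit -q --amend -m "Fix the bug"
new=$(git rev-parse HEAD)
new_tree=$(git rev-parse 'HEAD^{tree}')

# The old commit object is still there, unchanged, under its old hash ID.
old_subject=$(git log -1 --format=%s "$old")
new_subject=$(git log -1 --format=%s "$new")
echo "old: $old ($old_subject)"
echo "new: $new ($new_subject)"
```

Both commits point at the same tree object, which is the "same snapshot without taking the space twice" case mentioned earlier.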


2 Note that the requirement that identical hash IDs represent the same underlying object content is only maintained across two Gits that meet and exchange objects. Two repositories that are never connected can have such doppelgänger commits, as long as they never try to talk to each other.

3 You can delete a commit entirely, if a bit painfully, by just having nothing refer to it any more. Eventually the underlying Git object goes away, effectively releasing the hash ID. Since there are 160 bits in the current hash ID system, there are only 2^160 possible objects in any Git repository. Fortunately, that's plenty. Still, the pigeonhole principle combined with the birthday paradox makes for some interesting theoretical issues, and How does the newly found SHA-1 collision affect Git? has discussion about that.


Branch names

Given the above repository, we can find all the commits if we know the hash ID of commit C . Where will we store that? How about: in a branch name? Let's pick a name like master and use that to scribble down the hash ID of C :

A--B--C   <-- master

Now let's make a new commit, by checking out commit C and doing some work, in the usual way:

git checkout master
... do some work ...
git add ... various files ...
git commit

The new commit will package up a new snapshot, add whatever log message we provide, add our name and email address and the time-stamp, and, crucially, set the parent of our new commit to be commit C :

A--B--C--D

As the last step of committing, git commit will take the new commit's hash ID (whatever the actual checksum is, now that all the parts are cemented forever into place) and write that checksum down in the name master:

A--B--C--D   <-- master

So that's what a branch name is: it's a place to store the hash ID of the last commit. Mere humans don't have to remember hash IDs, because we have Git remember them for us. We remember only that master holds the hash ID of the latest commit, and have Git do the rest.
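A quick way to watch the name move is to ask Git what master holds before and after a commit. This is a sketch under the same assumptions as before (temporary repository, placeholder identity, empty commits standing in for real work).

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever the local default is
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m "C"
before=$(git rev-parse master)            # hash ID of C

git commit -q --allow-empty -m "D"
after=$(git rev-parse master)             # hash ID of D: the name moved
parent_of_d=$(git rev-parse 'master^')    # D's parent is the old tip, C

# Every branch name together with the hash ID it currently stores.
git show-ref --heads
```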

This is where your HEAD comes in

Of course, you can create multiple branch names. Each one just points to one particular commit. Let's make a new branch dev right now:

A--B--C--D   <-- master, dev

Note that master and dev both point to commit D , and all four commits are on both branches . But Git needs a way to know which name to change when we make a new commit. This is where the special name HEAD comes in. We have Git attach this name to one (and only one) branch name:

A--B--C--D   <-- master (HEAD), dev

or:

A--B--C--D   <-- master, dev (HEAD)

We do this using git checkout , which not only checks out the commit, but also attaches HEAD . If HEAD is attached to master and we make a new commit E it looks like this:

           E   <-- master (HEAD)
          /
A--B--C--D   <-- dev

If we now switch HEAD to dev (by doing git checkout dev ) and make a new commit F , we get:

           E   <-- master
          /
A--B--C--D
          \
           F   <-- dev (HEAD)
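The drawings above can be reproduced in a scratch repository. In this sketch (temporary directory, placeholder identity, empty commits named after the letters in the drawings), git symbolic-ref reports which branch name HEAD is attached to.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m "D"
git branch dev                                  # master and dev both point to D

attached_first=$(git symbolic-ref --short HEAD) # HEAD is attached to master
git commit -q --allow-empty -m "E"              # so only master moves

git checkout -q dev                             # re-attach HEAD to dev
attached_second=$(git symbolic-ref --short HEAD)
git commit -q --allow-empty -m "F"              # now only dev moves

master_tip=$(git log -1 --format=%s master)
dev_tip=$(git log -1 --format=%s dev)
fork_point=$(git log -1 --format=%s "$(git merge-base master dev)")
echo "master -> $master_tip, dev -> $dev_tip, shared history ends at $fork_point"
```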

This is where merges come in

Let's say we have a repository with a bunch of commits where the last few look like this:

       I--J   <-- br1 (HEAD)
      /
...--H
      \
       K--L   <-- br2

Here, we have some series of commits ending in one whose hash is H , with H holding some snapshot. Then someone—maybe us—made two more commits I and J on branch br1 , that we're on now. Someone—maybe us, maybe someone else—started from H and made two more commits K and L . This is true even if the someone else made K and L in a different repository. We both have H , and since all Gits everywhere agree on the hash ID computation, we both started from the same commit .

What the git merge command will do is to figure out what we changed in our branch br1 and what they changed in their branch br2 . It will then combine these changes. But we already noted that the word branch tends to be vague and ill-defined. What Git will really do here is find the common commit , the one that we both started from. We already saw that this is commit H .

Commit H is therefore the merge base of the merge operation. The other two interesting commits are simply the one named by our current branch—commit J at the tip of br1 —and the one named by the other branch, commit L at the tip of br2 . All three commits are snapshots , so Git needs to compare them:

  • git diff --find-renames hash-of-H hash-of-J finds what we did in br1
  • git diff --find-renames hash-of-H hash-of-L finds what they did in br2

Git can now combine these two sets of changes . If we changed some file and they didn't, Git should take our new file. If they changed some file and we didn't, Git should take their new file. If we both changed the same file, Git should start with the copy of the file from the merge base commit H itself, combine the two different changes, and apply the combined changes to that file.
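The two comparisons can be run by hand. The following is a rough sketch, assuming a throwaway repository in which br1 only touches a.txt and br2 only touches b.txt. The file names, their contents, and the commit_file helper are made up for illustration.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/br1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write CONTENT to FILE and commit it with MESSAGE.
commit_file() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

printf 'one\n' > a.txt
printf 'two\n' > b.txt
git add a.txt b.txt
git commit -q -m "H"
git branch br2                     # both names start at H

commit_file a.txt "one changed" "I"
commit_file a.txt "one changed again" "J"
git checkout -q br2
commit_file b.txt "two changed" "K"
commit_file b.txt "two changed again" "L"
git checkout -q br1

# The merge base is the commit both sides started from.
base=$(git merge-base br1 br2)
base_subject=$(git log -1 --format=%s "$base")

# What we did, and what they did, relative to that base.
ours=$(git diff --find-renames --name-only "$base" br1)
theirs=$(git diff --find-renames --name-only "$base" br2)
echo "base=$base_subject ours=$ours theirs=$theirs"
```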

So that's what git merge does, in this case: it combines our changes and their changes, which produces a new merged snapshot. This process of combining changes is what I like to call merge as a verb , or to merge . It's important to remember that this can be done by other commands, because other Git commands do it! This to merge or merge as a verb uses Git's merge engine to combine work.

In this case, though, git merge now goes on to make a merge commit . This is almost just an ordinary commit: it has a snapshot, and a log message, and so on, just like any other commit. What makes it special—a merge commit—is that it has two parent commits instead of just the usual one. The first parent is the same as usual: it's the commit we have checked out when we run git merge . The second parent is just the other commit—the one we picked out by using the name br2 , or in this case, commit L .

So now git merge makes a merge (merge as a noun) or a merge commit (merge as an adjective), which looks like this:

       I--J
      /    \
...--H      M
      \    /
       K--L   <-- br2

What happens to our branch name? The same thing as always, of course. Git writes the new hash ID for this new merge commit M into the current branch name:

       I--J
      /    \
...--H      M   <-- br1 (HEAD)
      \    /
       K--L   <-- br2

That's how we will merge someone's commits—the someone being whoever made K and L on br2 , in this case.

(Note that, in general, we get the same snapshot if we git checkout br2; git merge br1 . The merge base is still H and the two tips are L and J , and combining work produces the same result. What changes is that the first-parent of this other merge would be L , not J , so the parents get swapped, and the final name-update would update the name br2 rather than br1 . If we start throwing in extra merge options, though, like -X ours or -X theirs , more things can be different.)
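Here is a sketch of the merge itself, under the same assumptions as the drawings: a scratch repository, a placeholder identity, and made-up file names, with br1 and br2 changing different files so that the merge is conflict-free. HEAD^1 and HEAD^2 name the first and second parent of the merge commit.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/br1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write CONTENT to FILE and commit it with MESSAGE.
commit_file() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

printf 'one\n' > a.txt
printf 'two\n' > b.txt
git add a.txt b.txt
git commit -q -m "H"
git branch br2
commit_file a.txt "ours" "I"
commit_file a.txt "ours, again" "J"
git checkout -q br2
commit_file b.txt "theirs" "K"
commit_file b.txt "theirs, again" "L"
git checkout -q br1

tip_of_br2=$(git rev-parse br2)
git merge -q -m "M" br2            # combine work, then make merge commit M

first_parent=$(git log -1 --format=%s 'HEAD^1')    # the commit we were on
second_parent=$(git log -1 --format=%s 'HEAD^2')   # the commit we named
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
merged_a=$(cat a.txt)
merged_b=$(cat b.txt)
echo "M has $parent_count parents: $first_parent and $second_parent"
```

Only br1 moves. The name br2 still holds the hash ID of L afterwards.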

Not all git merge commands result in merges

It's worth noting an extra wrinkle or two here. Suppose we have this graph:

...--A--B--C--D--E--H--I   <-- branch1 (HEAD)
            \      /
             F----G   <-- branch2

and we run git merge branch2 . We've already merged branch2 earlier at commit H , which has parents E and G . The merge base is defined (loosely—technically it's the Lowest Common Ancestor in the DAG) as the nearest commit that's on both branches, and that's commit G , since from I we can walk backwards to H and then G , and of course from G we just stay right there at G .

In this case, git merge branch2 will say already up to date and do nothing. That's correct: their commit is G , and ours is I which already has G as an ancestor (a grandparent, in this case) so there is no new work to combine.
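This case is easy to reproduce. The sketch below assumes a scratch repository with empty commits named after the letters in the graph (starting at C rather than A, to keep it short) and checks that the second git merge branch2 leaves branch1 where it was.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/branch1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

for msg in C D E; do git commit -q --allow-empty -m "$msg"; done
git branch branch2 HEAD~1          # branch2 forks off at D
git checkout -q branch2
for msg in F G; do git commit -q --allow-empty -m "$msg"; done
git checkout -q branch1
git merge -q -m "H" branch2        # the earlier, real merge
git commit -q --allow-empty -m "I"

before=$(git rev-parse HEAD)
base_subject=$(git log -1 --format=%s "$(git merge-base branch1 branch2)")

# Nothing new on branch2, so Git reports that we are up to date and stops.
output=$(git merge branch2)
echo "$output"
after=$(git rev-parse HEAD)
```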

We can also have this related situation:

...--A--B--C--D--E--H--I   <-- branch1
            \      /
             F----G   <-- branch2 (HEAD)

where we run git merge branch1 . This time our commit is G and theirs is I . The merge base is still commit G as before. What Git does by default in this situation is to say to itself: "The result of diffing G against G is, by definition, empty. The result of combining nothing with something is, by definition, the something. So all I really have to do is git checkout hash-of-I . So I'll do that, but at the same time, make the name branch2 point to commit I too." The result is:

...--A--B--C--D--E--H--I   <-- branch1, branch2 (HEAD)
            \      /
             F----G

Git calls this a fast-forward operation . Git sometimes calls it a fast-forward merge , which is not good terminology as there is no actual merging involved.

You can force Git to make a real merge—to diff G against itself, and combine nothing with something, and then make a real merge commit—giving:

...--A--B--C--D--E--H--I   <-- branch1
            \      /    \
             F----G------J   <-- branch2 (HEAD)

To force a real merge here, use git merge --no-ff branch1 .

(Sometimes you want or need a real merge, and sometimes fast-forward-instead is OK. For what it's worth, the clicky buttons on the GitHub web hosting interface do not permit or perform fast-forward merges, even if you want them to. In effect, they always use git merge --no-ff .)
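Both behaviors can be compared side by side. This is a sketch, assuming a simplified history (G, then H and I on branch1) in a scratch repository. The extra name branch2-copy is made up here, so that the same starting point can be merged once with the default settings and once with --no-ff.

```shell
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/branch1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write CONTENT to FILE and commit it with MESSAGE.
commit_file() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit_file f.txt "g" "G"
git branch branch2                 # both extra names start at G
git branch branch2-copy
commit_file f.txt "g then h" "H"
commit_file f.txt "g then h then i" "I"

# Default: the name just slides forward to I, and no commit is created.
git checkout -q branch2
git merge -q branch1
ff_tip=$(git rev-parse HEAD)
ff_parents=$(git cat-file -p HEAD | grep -c '^parent ')

# --no-ff: a real merge commit J with parents G and I.
git checkout -q branch2-copy
git merge -q --no-ff -m "J" branch1
noff_parents=$(git cat-file -p HEAD | grep -c '^parent ')
noff_first=$(git log -1 --format=%s 'HEAD^1')
noff_second=$(git log -1 --format=%s 'HEAD^2')
```

The snapshot in J is the same as the snapshot in I, since combining nothing with something gives the something. Only the history differs.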

How all this relates to pull requests

Pull requests, or even Git's more primitive git request-pull command, are only useful if there is more than one Git repository involved in the process.

In this case we might have, in Repository #1, a series of commits:

       I--J   <-- master
      /
...--H 

Meanwhile, over in Repository #2, we have:

...--H
      \
       K--L   <-- master

Since these are two different repositories, they have their own private branch names. One has its master holding the hash ID of J. The other has its master holding the hash ID of L. Commit H is in both repositories, while commits I and J are only in #1, and K and L are only in #2.

If we were to somehow combine the two repositories, while changing the names so that they don't collide, we'd be back in our regular merge situation:

       I--J   <-- master of Repository #1
      /
...--H
      \
       K--L   <-- master of Repository #2

This is precisely what GitHub does with its clicky web interface. Whoever you are, #1 or #2 (let's pick #2 for concreteness), you tell GitHub: I'd like them to merge my master, i.e., commit L. GitHub then copies your commits, bit-for-bit so that their hash IDs remain the same, to their repository, putting the hash ID of commit L under a special name that's not master and not any other branch name. 4 Then they, GitHub, run a git merge, using this same sort of special name that's not a branch name at all. If that all works, then they tell whoever controls repository #1 that there's a pull request from you.

Whoever controls repository #1 can now click the "merge pull request" button. That takes the merge that GitHub already did 5 and moves their master , or whatever branch name, in their GitHub repository, appropriately:

       I--J
      /    \
...--H      M   <-- master
      \    /
       K--L

Commits K and L now appear in their repository, reachable by following the second-parent backwards link from master .
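The same sequence can be imitated locally with two ordinary repositories. This sketch is not what GitHub runs internally. It assumes two scratch directories named repo1 and repo2, a remote nickname "two", and made-up file names, and it uses git fetch followed by git merge in place of the web buttons.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write CONTENT to FILE and commit it with MESSAGE.
commit_file() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

# Repository #1 with the shared commit H.
git init -q "$work/repo1"
cd "$work/repo1"
git symbolic-ref HEAD refs/heads/master
printf 'a\n' > a.txt
printf 'b\n' > b.txt
git add a.txt b.txt
git commit -q -m "H"

# Repository #2 starts as a copy, so it has H under the same hash ID.
git clone -q "$work/repo1" "$work/repo2"

commit_file a.txt "a, by one" "I"
commit_file a.txt "a, by one, again" "J"

cd "$work/repo2"
commit_file b.txt "b, by two" "K"
commit_file b.txt "b, by two, again" "L"
hash_l=$(git rev-parse master)

# Bring #2's commits into #1 under a name that cannot collide: two/master.
cd "$work/repo1"
git remote add two "$work/repo2"
git fetch -q two
base_subject=$(git log -1 --format=%s "$(git merge-base master two/master)")

git merge -q -m "M" two/master
first_parent=$(git log -1 --format=%s 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
```

The second parent of M carries exactly the hash ID that L had in repository #2, which is the "bit-for-bit" copying described above.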

What this means for you, as someone who wants to make a pull request, is that you have to arrange for your repository on GitHub to have a commit or chain of commits that GitHub will be able to test-merge for you. GitHub will then present the request to the owner of that repository, and that owner will be able to just do one click to complete the merge by updating their branch's name to use the test merge GitHub has made.

The commits, and the test-merge result, are determined by what commits you put into your GitHub repository. If you have your own separate repository on your own machine locally, you can put commits into it and use git push to send those commits to your GitHub repository.

This is, obviously, all a bit convoluted—but if you keep your local machine's repository and your own GitHub repository in sync, so that they always "look the same" as it were, you get to ignore the extra layer here. The issue with ignoring this layer is that it is still there! If you let your repository and your GitHub repository get out of sync, this shows right up again.


4 When you make a pull request, GitHub allocates a unique number for it (unique to the destination repository). Say this is Pull Request #123. The name that GitHub uses for your commits, once they are copied into their GitHub repo, is refs/pull/123/head . The name that GitHub uses for the test merge it makes is refs/pull/123/merge . If the test merge fails with a conflict, GitHub doesn't make one after all and does not create the second name.

5 If whoever controls the PR-target repository pushes new commits to their branch, the test merge that GitHub made becomes invalid (it's "out of date"). GitHub will make a new test merge if and when it's appropriate. I'm not sure if they delete the refs/pull/123/merge name in between as I have never tested that.

To do this by pull request (while avoiding merging into master any other commits from the feature branch, as a feature-branch >> master pull request would), you can do the following:

1) Create a new branch named hotfix/master (for example) from master directly on the remote (GitHub? Bitbucket? other? You didn't mention, but feel free to comment and I'll adjust), then, from your local repository, fetch the latest refs from the remote:

git fetch

2) The fetch output should list your newly created branch, so create its local counterpart:

git checkout hotfix/master

3) Now import the commit you want onto it:

git cherry-pick <commitHash>

Alternatively, if the commit you want is the last one on the feature branch, just use the branch name:

git cherry-pick <branchName>

4) Resolve any conflicts that arise at this point, and push the branch to the remote:

git push origin hotfix/master

5) Finally, go back to the remote interface where you create your pull requests, and create one from hotfix/master into master.
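Steps 1 through 4 can be rehearsed end to end without a hosting service. In this sketch a local bare repository stands in for the remote, the branch is created there with git branch rather than through a web interface, and the feature branch, file names, and messages are all made up for illustration. The pull request in step 5 still has to be opened on the real service.

```shell
set -e
work=$(mktemp -d)
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: write CONTENT to FILE and commit it with MESSAGE.
commit_file() {
    printf '%s\n' "$2" > "$1"
    git add "$1"
    git commit -q -m "$3"
}

# A bare repository plays the role of the hosting service.
git init -q --bare "$work/origin.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

commit_file base.txt "base" "base"
git push -q origin master

# A feature branch with three commits; we only want the middle one.
git checkout -q -b feature
commit_file one.txt "1" "feature work 1"
commit_file fix.txt "fix" "the fix"
wanted=$(git rev-parse HEAD)
commit_file two.txt "2" "feature work 2"

# Step 1: create hotfix/master from master on the remote, then fetch.
git --git-dir="$work/origin.git" branch hotfix/master master
git fetch -q origin

# Step 2: create the local counterpart of origin/hotfix/master.
git checkout -q hotfix/master

# Step 3: copy just the wanted commit onto it.
git cherry-pick "$wanted" >/dev/null

# Step 4: publish the branch, ready for a pull request into master.
git push -q origin hotfix/master

picked=$(git log --format=%s origin/master..origin/hotfix/master)
echo "commits the pull request would contain: $picked"
```

The cherry-picked commit is a copy with a new hash ID. The original commit stays on the feature branch.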

You can cherry-pick it:

$ git cherry-pick <commit hash>


 