If Branch A has 5 commits and Branch B has 7, when I merge B into A, A now has 12 commits.
Is that the expected result? I would've thought a merge would count as a single commit.
When you say "branch A has five commits", you're probably not counting all the commits that branch A contains. The same applies to your seven commits in branch B. To really understand this, it's important to realize that in Git, branches (or more precisely, branch names) don't actually have any meaning. It's only the commits that matter.
To see how this works, let's start with a really tiny repository with just three commits in it. The three commits have big long ugly Git-hash-ID names, but let's just call them commits A, B, and C, as if commits had single uppercase letters as their real names. (We'll run out pretty fast, which is one reason Git uses those big ugly hash IDs.)
The first big important secret of Git is that every commit stores its previous commit's hash ID inside it. Whenever you have the hash ID of a commit in your hands, we say that you're pointing to that commit. So our three commits go like this:
A <-B <-C
Commit C stores B's hash ID, so C points back to B. B stores A's hash ID, so B points back to A. A is of course the very first commit we ever made: it can't point any further back. It's a special case, a root commit, of which there's always at least one if the repository isn't empty. Usually there's exactly one root commit, with that one being the very first commit.
The next big important secret is a simple follow-on to this first one, and that is that a branch name like master or develop simply points to one commit. The one commit that our master points to, in this case, will be commit C:
A--B--C <-- master
I always get a bit lazy about drawing the internal arrows between commits, for various reasons. One is that once we make a commit, nothing and no one, not even Git itself, can change the commit. Commit C is frozen in time forever, always pointing back to B, which is frozen and points to A, and so on. The internal arrows therefore invariably point backwards. Git calls these the parents of the commit: the parent of C is B, and the parent of B is A.
The branch name pointers are different. Unlike the frozen contents of each commit, a branch name pointer can and does change.
Let's git checkout master, which extracts commit C into our work tree, giving us files we can see and work on / with. Then we'll make some changes, git add the updated files, and git commit to make a new commit that we'll call D. Git will package up our new files and make this new commit D, pointing back to the commit we had out (i.e., C) so that we now have:
A--B--C--D
and then as its final act, git commit writes D's hash ID into the name master, so that master now points not to C but to D:
A--B--C--D <-- master
This is how branches grow as you add new commits: each new commit points back to the one that was the last one in the branch, and then Git updates the branch name so that the name now identifies the new tip. Whenever Git looks for the history—for what happened over time—it works by starting at the last commit, the one pointed-to by the name, and working backwards, one commit at a time.
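The growth of a branch described above can be sketched as a toy model in Python. This is only an illustration, not how Git stores anything: the dictionaries `parents` and `branches` and the helpers `commit` and `history` are made-up names, and single letters stand in for real hash IDs.

```python
# Toy model: each commit records its parent; a branch name is a movable
# pointer to exactly one commit. Letters stand in for real hash IDs.
parents = {"A": [], "B": ["A"], "C": ["B"]}   # commit -> list of parents
branches = {"master": "C"}                     # branch name -> tip commit


def commit(branch, new_id):
    """Make new_id with the branch's current tip as parent, then move the name."""
    parents[new_id] = [branches[branch]]
    branches[branch] = new_id


def history(tip):
    """Walk backwards from a tip, one commit at a time (linear history only)."""
    out = []
    while tip is not None:
        out.append(tip)
        tip = parents[tip][0] if parents[tip] else None
    return out


commit("master", "D")
print(branches["master"])            # D
print(history(branches["master"]))   # ['D', 'C', 'B', 'A']
```

Note that the existing entries in `parents` are never modified; only the name in `branches` moves.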
To make a new branch, what Git does is just add a new name pointing to some existing commit. Let's make branch branch-a now, in our four-commit repository:
A--B--C--D <-- master, branch-a (HEAD)
Besides adding the name branch-a pointing to D, I've attached the special name HEAD (in all capitals, though you can use @ if you like a shorter name) to one of the two branch names. That's how Git remembers the current branch.
Before we make any new commits, answer for yourself: how many commits are there in master, and how many are there in branch-a? If you didn't answer "four" each time, why not? If you ask Git, the answer is four: there are four commits, D then C then B then A, on both branches.
Let's add five commits to our new branch-a now, by changing stuff and using git add and git commit in the usual way. Git will construct five new, unique, big ugly hash IDs, but we'll call the new commits E through I and draw them in:
           E--F--G--H--I <-- branch-a (HEAD)
          /
A--B--C--D <-- master
When we made E, Git made it with parent D, and then changed the name branch-a to point to E. When we made F, its parent was E, and Git updated branch-a to point to F. We repeated this five times and we have five commits on branch-a that aren't on master, plus the four commits that are on both branches. So branch-a has not five but rather nine commits. It's just that five of them are only on branch-a.
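The "nine, of which five are only on branch-a" count can be reproduced with a small reachability walk. This is a sketch over a toy graph with made-up names (`parents`, `branches`, `reachable`). In a real repository, `git rev-list --count branch-a` and `git rev-list --count master..branch-a` report the corresponding numbers.

```python
# Toy graph after adding E..I on branch-a; letters stand in for hash IDs.
parents = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],
    "E": ["D"], "F": ["E"], "G": ["F"], "H": ["G"], "I": ["H"],
}
branches = {"master": "D", "branch-a": "I"}


def reachable(tip):
    """Return every commit found by starting at tip and following parents."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


on_a = reachable(branches["branch-a"])
on_master = reachable(branches["master"])
print(len(on_a))                   # 9
print(sorted(on_a - on_master))    # ['E', 'F', 'G', 'H', 'I']
```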
Now let's make branch-b, by first switching back to master and then creating the new name branch-b, pointing to commit D:
           E--F--G--H--I <-- branch-a
          /
A--B--C--D <-- master, branch-b (HEAD)
Note that nothing else inside the repository itself has changed here. Our work-tree (and index) have changed, since they've gone back to commit D, and we've added a new name branch-b that, like master, identifies commit D, but the commits are all undisturbed.
Now let's add seven commits that are unique to branch-b:
           E--F--G--H--I <-- branch-a
          /
A--B--C--D
          \
           J--K--L--M--N--O--P <-- branch-b (HEAD)
There are actually 11 commits on branch-b, but four of them are shared (with master, which I've stopped drawing out of laziness, and with branch-a).
Now you want to merge branch-b into branch-a. So the commands you run will be:
git checkout branch-a
git merge branch-b
The first step chooses commit I as the current commit and branch-a as the name to which HEAD is to be attached. It copies the contents of commit I to the work-tree (and index / staging-area). There are no changes to the graph itself, but now HEAD indicates branch-a and hence commit I:
           E---F----G---H----I <-- branch-a (HEAD)
          /
A--B--C--D
          \
           J--K--L--M--N--O--P <-- branch-b
(I've also stretched out the top line a bit because of something I intend to draw in a moment. The position of the commits in the graph is stretchy because Git doesn't care about the actual time of the commit, only about the shape of the commits and their connecting arcs, and you can bend and twist the graph however you like, as long as you don't break any of the connections, or make up new ones that aren't there.)
The git merge command then does something a little tricky. First, it finds the merge base between the current commit I and the other commit P. The merge base is, roughly speaking, the point where the two branches diverged. In this case that's super-obvious from the graph: it's commit D.
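One way to picture merge-base finding is as "the common ancestors that are not themselves ancestors of another common ancestor". The sketch below applies that idea to the toy graph. The helper names (`chain`, `reachable`, `merge_bases`) are made up for illustration, and this is not Git's actual algorithm, which is considerably more efficient. In a real repository the command is `git merge-base branch-a branch-b`.

```python
# Toy graph: A-B-C-D, then E..I on branch-a and J..P on branch-b.
parents = {"A": []}


def chain(start, ids):
    """Add a linear run of commits, each pointing back to the previous one."""
    prev = start
    for c in ids:
        parents[c] = [prev]
        prev = c


chain("A", "BCD")
chain("D", "EFGHI")
chain("D", "JKLMNOP")


def reachable(tip):
    """Return every commit found by starting at tip and following parents."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge_bases(x, y):
    """Common ancestors of x and y, minus those that are ancestors of another one."""
    common = reachable(x) & reachable(y)
    return {c for c in common
            if not any(c != d and c in reachable(d) for d in common)}


print(merge_bases("I", "P"))   # {'D'}
```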
Git now figures out what "we" changed on branch-a by doing:
git diff --find-renames <hash-of-D> <hash-of-I> # what we changed
It gets a second diff to find out what they changed on branch-b:
git diff --find-renames <hash-of-D> <hash-of-P> # what they changed
Git then combines the two sets of changes, applying the combined changes to whatever is in the snapshot in commit D.
This "make two diffs, combine them, and apply them to the merge base" process is the action form of merging. I like to refer to this as the verb to merge, i.e., to combine changes. Because commits are snapshots, not change-sets, Git has to do the two diffs. In order to have a sensible starting point, Git has to find the merge base. That's why we have all this work that happens as part of the verb to merge when we merge commits I and P.
Now that Git has done all this to-merge work, Git will make a merge commit. Well, it will often or usually make one; we'll see the exceptions in a moment. Note that this uses the word merge as an adjective, though, modifying the word commit. We can also refer to this new merge commit as a merge, using the word merge as a noun. I like to refer to this as merge-as-a-noun or merge-as-an-adjective, to distinguish it from the process, the to merge verb. For the git merge command, we're doing the process first, then making the merge commit at the end. But let's draw it:
           E---F----G---H----I
          /                   \
A--B--C--D                     Q <-- branch-a (HEAD)
          \                   /
           J--K--L--M--N--O--P <-- branch-b
This new commit, merge commit Q, is special in precisely one way: it has two parents instead of one. It points back first to commit I, to say commit I was at the tip of branch-a a moment ago and is a parent of commit Q, but then it also points back to commit P, to say commit P is also a parent of commit Q.
If we now ask Git how many (and which) commits are on branch-a, Git starts at Q, then works backwards through both I and P, eventually arriving at D (to which master still points), and then all the way back to A. So the number of commits is now 17: A through D plus E through I plus J through P plus Q. If we ask how many commits are on branch-a that aren't on master, we get 13: five for E through I, seven for J through P, and one for Q.
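The 17 and 13 can be checked the same way, by giving the merge commit two parents in a toy graph. As before, this is only a sketch with made-up helper names (`chain`, `reachable`), and letters stand in for hash IDs.

```python
# Toy graph after the merge: Q has two parents, I (first) and P (second).
parents = {"A": []}


def chain(start, ids):
    """Add a linear run of commits, each pointing back to the previous one."""
    prev = start
    for c in ids:
        parents[c] = [prev]
        prev = c


chain("A", "BCD")
chain("D", "EFGHI")
chain("D", "JKLMNOP")
parents["Q"] = ["I", "P"]          # the merge commit
branches = {"master": "D", "branch-a": "Q", "branch-b": "P"}


def reachable(tip):
    """Return every commit found by starting at tip and following all parents."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


on_a = reachable(branches["branch-a"])
print(len(on_a))                                     # 17
print(len(on_a - reachable(branches["master"])))     # 13
```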
Here's another way to draw what happened:
...--D--E--F--G--H--I------Q <-- branch-a (HEAD)
      \                   /
       J--K--L--M--N--O--P <-- branch-b
The number of reachable commits remains the same, though: Git starts at Q, moves back to both I and P, moves back to both H and O, and so on until reaching D, at which point it moves back to whatever comes before shared commit D.
If you have git log draw the graph, using git log --graph or git log --graph --oneline, Git will draw it vertically, with commit Q at the top and the branching structure represented as individual lines:
* hashofQ (HEAD -> branch-a) Merge ..
|\
| * hashofP commit message for P
* | hashofI commit message for I
...
or similar. The exact position of each * and line depends on additional sorting options you may pass to git log such as --author-date-order, though --graph always enforces at least the --topo-order option. Graphical viewers such as gitk, and various GUIs, may mimic git log --graph --oneline but make it all prettier (though as always, beauty is in the eye of the beholder).
git merge doesn't always merge
The git merge command can do more than build a merge (noun) using the to merge (verb) process. arkus mentioned git merge --squash, which does the to merge part of the process, but then simply stops, without making a commit and without recording the fact that the next commit should be a merge. In this particular case, we'd then run git commit ourselves to make commit Q. New commit Q would be an ordinary commit, not a merge commit, and we might draw it in like this:
...--D--E--F--G--H--I--Q <-- branch-a (HEAD)
      \
       J--K--L--M--N--O--P <-- branch-b
Because there is no connection between Q and P, someone coming in later (including yourself, or Git) and looking at this graph may have no idea that commit Q is the result of a merge. The seven commits that are exclusive to branch-b are still exclusive to branch-b. In general, if you have done this, you should immediately remove the name branch-b from this repository and from every clone of this repository, so as to utterly forget that commits J through P ever existed.
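In the toy model, a squash result is simply a commit Q with a single parent. The sketch below (with made-up helper names `chain` and `reachable`) shows that J through P stay unreachable from branch-a, which is why nothing in the graph records that a merge happened.

```python
# Toy graph after a squash: Q is an ordinary commit whose only parent is I.
parents = {"A": []}


def chain(start, ids):
    """Add a linear run of commits, each pointing back to the previous one."""
    prev = start
    for c in ids:
        parents[c] = [prev]
        prev = c


chain("A", "BCD")
chain("D", "EFGHI")
chain("D", "JKLMNOP")
parents["Q"] = ["I"]               # no link back to P
branches = {"branch-a": "Q", "branch-b": "P"}


def reachable(tip):
    """Return every commit found by starting at tip and following all parents."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


on_a = reachable(branches["branch-a"])
only_b = reachable(branches["branch-b"]) - on_a
print(len(on_a))         # 10
print(sorted(only_b))    # ['J', 'K', 'L', 'M', 'N', 'O', 'P']
```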
This is sometimes, but not always, a viable, useful, and good work-flow. It's particularly useful when the individual commits on branch-b have never been seen anywhere else, so that you know nobody else has them, and you only made them as temporary commits with the intent to replace them all with a single "add the feature" commit, i.e., commit Q, at the end. After doing the squash merge, you force Git to delete your branch-b name and you forget that you ever did any of the individual commits. You have one final good commit and you pretend to the world that you knew how to make that commit all at once.
Sometimes, though, even if you're introducing a feature, it's good to keep it as a series of separate commits. In particular, what if you've introduced a bug too? In that case, if you shrink your feature down to a series of simple but clear commits—let's say three of them—and then you merge them with a real merge, you get a graph like this:
...--D--E--F--G--H--I--Q <-- branch-a (HEAD)
      \               /
       R-------S-----T <-- branch-b
If it now turns out that you have introduced a bug, it's probably possible to check out commits R and S and T and see which of those commits introduced the bug. Then you can compare R vs D, S vs R, or T vs S, to help you find out how the bug got in, and figure out what to do to fix it.
What this boils down to is that squash merges aren't bad, they're just a tool. Use your tools to do things in a way that will make life easier for yourself in the future. If that means squashing, go ahead and squash. If not, don't.
git merge doesn't always merge
We should also cover fast-forward operations. Consider a situation in which you make a branch:
...--C--D <-- master, feature (HEAD)
You then make some commits on that branch:
...--C--D <-- master
         \
          E--F--G--H <-- feature (HEAD)
Everything seems great and you'd like to introduce the feature now, keeping all four of these commits intact. If you now run:
git checkout master
git merge feature
Git will say something about fast-forward, and you will be left with this graph:
...--C--D--E--F--G--H <-- master (HEAD), feature
The name feature has not moved (it still points to commit H) but the name master has moved, and now also points to commit H. There's no new merge commit!
What Git did here is that it did the merge-base finding just as it would for a real merge, and found that the best common commit between master and feature was commit D. The name master pointed to commit D, though, so if Git were to do the usual to merge verb, it would run:
git diff --find-renames <hash-of-D> <hash-of-D> # what we changed
and the answer would, of course, be we changed nothing! Then Git would need to diff D vs H to find out what they changed, which of course would be whatever they changed. Git would apply those changes to D and get ... commit H, again.
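The fast-forward decision can be sketched in the toy model as a single check: if the current tip is already an ancestor of the other tip, just move the name. The function `merge` and its `no_ff` parameter below are made-up names that loosely mirror the behaviour described here. Real Git also has to do the diff-and-combine work, which this sketch skips entirely.

```python
# Toy graph: master at D, feature at H (history before C is omitted).
parents = {"C": [], "D": ["C"], "E": ["D"], "F": ["E"], "G": ["F"], "H": ["G"]}
branches = {"master": "D", "feature": "H"}


def reachable(tip):
    """Return every commit found by starting at tip and following all parents."""
    seen, stack = set(), [tip]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents[c])
    return seen


def merge(current, other, new_id, no_ff=False):
    """Move `current` forward if possible, else record a two-parent commit."""
    ours, theirs = branches[current], branches[other]
    if theirs in reachable(ours):
        return "already up to date"
    if ours in reachable(theirs) and not no_ff:
        branches[current] = theirs            # no new commit at all
        return "fast-forward"
    parents[new_id] = [ours, theirs]          # a real merge commit
    branches[current] = new_id
    return "merge commit"


result = merge("master", "feature", "I")
print(result, branches["master"])   # fast-forward H
```

Calling `merge("master", "feature", "I", no_ff=True)` from the original state would instead record I with parents D and H.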
If Git made a real merge out of this, it would look like:
...--C--D------------I <-- master (HEAD)
         \          /
          E--F--G--H <-- feature
The snapshot for commit I would match that for commit H.
You can force Git to make this merge commit:
git checkout master; git merge --no-ff feature
That way you get the same kind of true merge you would have gotten had master had some commit after D. You can do this if you want to emphasize to a future viewer (who may well be yourself in a year or two) that commits E through H were made as a group, and together they implement some feature. Or you may not care: you, and future-you a year from now, may prefer to just see commits E through H as a logical extension of master, without any need to remember that these four were done specifically for some particular feature.
Again, this really boils down to the fact that fast-forward merge vs real merge is a tool, which you can use to communicate information to future users of this repository. Use your tools to arrange things to make the life of future-you easier.
If you think you'll prefer to see the merge in a git log --graph or graphical viewer, force the non-fast-forward merge with git merge --no-ff. If you think you'll prefer not to see the merge, you can even use git merge --ff-only to make sure that Git will just fail if a true merge is required (after which you will need to do something different, and that's beyond the scope of this already-too-long answer).
It depends on the history of the branches. It could be anywhere between 7 revisions (a fast-forward) and 13 revisions (merging 2 totally unrelated histories). It all depends on the histories and how many diverging revisions you are talking about (or whether you are forcing a --no-ff). One possible way to get 12 revisions is to have a single common ancestor between both branches, so you have the common ancestor, 4 revisions on one branch and 6 on the other (after the common ancestor) plus the merge revision: 1 + 4 + 6 + 1 = 12. But as I said, it all depends on the history. 8 could be achieved by having all 5 revisions of one branch be the first 5 revisions of the other branch and then doing merge --no-ff. That will create a merge commit for what would have been a fast-forward. Result: 8 revisions. With 4 common ancestors and merging you get 9 revisions, and so on.
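The arithmetic behind these cases is "commits on one branch, plus commits on the other, minus the shared ones, plus one if a merge commit is made". A small sketch, where `revisions_after_merge` is a made-up helper and the branch sizes 5 and 7 come from the question:

```python
def revisions_after_merge(a, b, shared, merge_commit=True):
    """Total commits reachable after merging branches with a and b commits,
    of which `shared` are common to both."""
    return a + b - shared + (1 if merge_commit else 0)


print(revisions_after_merge(5, 7, 5, merge_commit=False))  # 7  (fast-forward)
print(revisions_after_merge(5, 7, 0))                      # 13 (unrelated histories)
print(revisions_after_merge(5, 7, 1))                      # 12 (one common ancestor)
print(revisions_after_merge(5, 7, 5))                      # 8  (--no-ff)
print(revisions_after_merge(5, 7, 4))                      # 9  (four common ancestors)
```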
If you want to merge the branch as a single commit, you could use the --squash option of git merge.
It stages the combined changes from the branch passed to git merge --squash <branch> without committing, so you can then commit them yourself as one commit.
Default git merge branch: (figure not preserved)
git merge --squash branch: (figure not preserved)