
Feature branch is 1 commit behind Master after merging Hotfix branch to both

I am new to git branches, so this question must be silly:

I have a master branch and a feature branch. I've created a third hot-fix branch, made some changes, and merged it to master. Afterwards, I also merged it to feature, because I am doing a lot of work there and I want the bugs fixed there as well.

Now, GitHub says that my feature branch is 1 commit behind master, where this one commit is the pull request hot-fix -> master, even though in reality both branches are identical in terms of the hot-fix changes, because I also pulled them into feature.

How should I be merging hot-fix branches to both master and feature without the latter being falsely behind?

You're one commit behind because there's one commit that's "on" master that is not on your branch. That's no big deal. Well, you'll find lots of opinions about what is or is not a big deal, how you should work, and so on. Some will contradict others. You can pick any plan that works and stick with it, or hopscotch around with various things that work: whatever works, works. It might be confusing to others though, which might make it stop working in a different sense.

Instead of getting all caught up in right-flow / wrong-flow arguments, let's just look at what you did and why you see the result you see. You can choose your eventual course later, once you understand the building blocks.

I have a master branch and a feature branch. I've created a third hot-fix branch, made some changes, and merged it to master. Afterwards, I also merged it to feature, because I am doing a lot of work there and I want the bugs fixed there as well.

Git isn't really about branches, and in a way, branch is a bad word: not in the sense of being explicit or foul language, but in the sense that it fails to communicate the right meaning. The problem is that the word branch in Git has at least two different meanings, and probably more: exactly how many depends on what you count and how you go about counting. (See also What exactly do we mean by "branch"?)

What Git is really about, in the end, is the commit. A repository is a collection of commits, all stored in a big database indexed by hash IDs (more formally, object IDs or OIDs), plus some other stuff, which is important, but let's concentrate on the commits first. Each commit is numbered: it gets a unique, but random-looking and very large and unwieldy, number, which Git insists on spelling in hexadecimal, as, e.g., 4af7188bc97f70277d0f10d56d5373022b1fa385. In theory, the number alone suffices to find the commit, because the number is totally unique. Any Git repository that has a commit with that number has exactly that commit.[1] In practice, we have to have the repository already, and then we can ask it for that commit by hash ID: if it isn't in there, we pick some other repository that does have it and use git fetch to get it from the other repository.

So, in short, we use the commits in the repository, as that's what the repository is about. We ask for them by raw hash ID; Git pulls them out. Each commit holds two things:

  • A commit holds a full snapshot of every file, as of the state it had when it got frozen into that commit, as an archive. These are the files we want to use and work with. They're in a special, read-only, Git-only format, compressed and (importantly) de-duplicated, so that the repository doesn't fatten up grotesquely as we add commit after commit. Only changed files need to be stored, and even then they can be stored very efficiently (with further tricks happening eventually, long after the initial de-duplication).

  • A commit also holds some metadata , or information about the commit itself. This includes things like the name and email address of the person who made the commit, and date-and-time stamps, and so on.


[1] Some simple math, specifically the pigeonhole principle, tells us that this cannot work forever. The sheer size of the hash space is meant to put off the inevitable Git failure for billions of years, though the birthday problem or birthday paradox means it can't work even as long as that. But if it works until after we're all dead and gone, that's probably long enough. That's someone else's problem!
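
To make the hash-ID numbering concrete, here is a minimal Python sketch of content addressing, assuming the classic SHA-1 object format. The function name git_object_id is made up for illustration, and the sketch hashes a blob (file contents) rather than a full commit, whose body would also carry the tree, parent, author and message lines.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Return the SHA-1 object ID for an object of the given kind and body."""
    # Git hashes a small header ("<kind> <size>" plus a NUL byte) followed by the body.
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


empty_id = git_object_id("blob", b"")
other_id = git_object_id("blob", b"x")
print(empty_id)              # the ID of the empty blob
print(empty_id != other_id)  # different content, different ID
```

Because the ID is computed from the content, changing even one byte yields a different ID, which is why a stored object can never be edited in place. Running git hash-object on an empty file should report the same ID as empty_id.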


Commits form branches

Crucially for Git itself, Git stores the raw hash IDs for a list of previous commits in any new commit anyone ever makes, in the metadata for the new commit. The commits themselves are entirely read-only (this is required by the magic hash ID numbering system), so no part of any commit can ever change, and each commit records one previous commit hash ID (well, zero or more, but usually just one). This forms a backwards-looking chain, where, if we just know the hash ID of the latest commit, we can have Git find all the earlier commits.

Let's call the latest commit hash ID H (for "hash"), and draw the commit:

            <-H

We draw the commit with an arrow sticking out of it, pointing backwards, to represent the stored hash ID in H . Let's call that hash ID G for short and draw commit G too:

        <-G <-H

Of course, G has another arrow sticking out of it: G 's metadata stores the raw hash ID of some earlier commit. Let's call that one F and draw it in too:

... <-F <-G <-H

The result is a chain that goes back all the way through history, until it reaches the very first commit ever. If we call that commit A , and draw it in, commit A has no backwards arrow: its list of previous commits is empty . This is kind of a special case, but it simply means "there's no previous commit", which allows Git to stop going backwards:

A--B--...--G--H

We'll get lazy (for various reasons good and bad) and stop drawing the internal arrows here as arrows. Inside Git, they point backwards, and it's important to remember that now and then, but humans really like to go forwards a lot, so a diagram like this is OK, and easier to draw (i.e., I'm a little lazy). The point here is to note that all we need is to somehow (magically?) find commit H. From there, Git can find every earlier commit.
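
The backwards walk can be sketched in a few lines of Python. This is a toy model, not Git's implementation: the one-letter IDs stand in for real hash IDs, the chain is assumed to have exactly eight commits, and the names parents and walk_back are made up for illustration.

```python
# Toy object database: commit ID -> list of parent IDs (hypothetical one-letter IDs).
parents = {
    "A": [],  # the very first commit has no parent, so the walk stops here
    "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"],
    "F": ["E"], "G": ["F"], "H": ["G"],
}


def walk_back(tip, parents):
    """Yield every commit reachable from `tip` by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        yield commit
        stack.extend(parents[commit])


history = list(walk_back("H", parents))
print(history)
```

Knowing only H is enough to recover the whole chain; nothing in the data lets us go from A forwards to B.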

Branch names find branches

We could memorize the actual latest hash ID. We'd have to memorize a new, big ugly hash ID every time we make a new commit, though. We could write them down on paper, and carefully type them in to Git commands, but that seems stupid... and, wait! We have a computer! Let's make it remember these hash IDs for us. What if we dropped the latest hash ID into a file... no, better, a database of branch names?

And that's exactly what we do: a repository has, besides the database of commit objects and other internal Git objects that's indexed by raw hash IDs, a database—usually much smaller—containing branch and tag and other such names. Each name holds just one hash ID, but that's all it takes:

...--G--H   <-- master

The name master literally holds the hash ID for commit H . So now we can just run:

git log master

and view commits from H on backwards, as Git follows H back to G , then to F , and so on, backwards.

We can, at any time , tell Git to add new names . We can also tell Git to delete names. The names simply point to a commit: there is one arrow coming out of each name, pointing to any commit we like.

If we create a new branch name feature right now, we have to pick one of our commits, to make the name point to that commit. We can pick H , or G , or anything earlier. But let's pick H —the latest commit—because why would we pick an earlier one? We get this:

...--G--H   <-- feature, master

Both names point to commit H . That's perfectly fine. But that leaves us with a couple of questions:

  • How do we know which name we're actually using? To answer that, Git attaches a special name, HEAD , written in all uppercase like this, to one and only one branch name.

  • Which branch are these commits on?

The second question is a trick question, because these commits are actually on both branches . What it means, in Git, for a commit to be "on" some branch, is that we can find that commit by starting with the commit to which the branch name points and working backwards.

So, with:

...--F--G--H   <-- feature (HEAD), master

we know that we're "on" branch feature , and that this branch contains commits up through and including H . Branch master also contains the same commits.
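
As a rough illustration of "a commit is on a branch if it is reachable from the branch's tip", here is a small Python model. The dictionaries parents and refs, the variable head, and both helper functions are hypothetical stand-ins for Git's object and name databases.

```python
parents = {"F": [], "G": ["F"], "H": ["G"]}  # toy history; F is treated as the root
refs = {"master": "H", "feature": "H"}       # names database: each name holds one hash ID
head = "feature"                             # HEAD is attached to exactly one branch name


def reachable(tip):
    """Return the set of commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def branches_containing(commit):
    """List the branch names whose tip can reach `commit`."""
    return sorted(name for name, tip in refs.items() if commit in reachable(tip))


print(branches_containing("G"))
print(refs[head])
```

Commit G shows up under both names, which is the sense in which it is on both branches. In a real repository, git branch --contains followed by a commit ID answers the same question.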

Making new commits drags the current branch name forward

If we now make a new commit, in the usual way—we'll skip right over what that is since you already know it—this new commit will get a new unique hash ID. Let's call that hash ID I . Commit I will point back to existing commit H , like this:

...--G--H
         \
          I

I drew new commit I on a new line, because Git's last trick, as it creates new commit I and thereby obtains the new unique hash ID, is to store that ID into the current branch name . The current branch name is the one that has HEAD attached to it. The other branch names—no matter how many there are—don't change at all during this update, so now we have:

...--G--H   <-- master
         \
          I   <-- feature (HEAD)
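
The same step as a Python sketch, under the assumption that a repository is just a parents table, a names table, and a HEAD attachment (all three names, and the commit function, are illustrative only):

```python
parents = {"G": [], "H": ["G"]}         # toy history; G is treated as the root
refs = {"master": "H", "feature": "H"}  # both names point to H
head = "feature"                        # the current branch name


def commit(new_id):
    """Record a new commit whose parent is the current tip, then move the current branch."""
    parents[new_id] = [refs[head]]  # the new commit points back to the old tip
    refs[head] = new_id             # only the name HEAD is attached to gets updated


commit("I")
print(refs)
```

After the call, feature holds I while master still holds H, matching the drawing above.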

Putting some pieces together

Let's go back to your original statement:

I have a master branch and a feature branch. I've created a third hot-fix branch, made some changes, and merged it to master. Afterwards, I also merged it to feature, because I am doing a lot of work there and I want the bugs fixed there as well.

Each of these names must, necessarily, point to one single commit . Which one? I have no idea: that's in your repository, which I don't have. I'll make something up, with the caveat that my made-up example obviously won't match your own repository exactly. Here's my made-up example:

          E  <-- release/1.0
         /
...--C--D--F--G   <-- master
               \
                H   <-- feature (HEAD)

Now, to fix the bug, which was in commit D , we've checked out commit D and attached a branch name there, hot-fix :

          E  <-- release/1.0
         /
...--C--D   <-- hot-fix (HEAD)
         \
          F--G   <-- master
              \
               H   <-- feature

and then we fix the bug, making a new commit I :

          E  <-- release/1.0
         /
...--C--D--I   <-- hot-fix (HEAD)
         \
          F--G   <-- master
              \
               H   <-- feature

We can now merge commit I , the hot-fix for the bug in D , into any of these various branches. (That's the advantage to going back in time and fixing the bug as soon as it got introduced: it's always possible to "forward merge" the fix. Not everyone uses this method, as it brings a certain amount of pain as well, but without any better guide to drawing your situation, it's what I'm drawing.)

Merging in Git

Note: I'm moving fast here, skipping over a lot of helpful introductory material. For instance, I've skipped how Git transforms a commit, which is a snapshot, into a diff or patch or changeset, which isn't. For a gentler introduction, consider a longer tutorial, or one of my other answers.

Merging in Git is about combining changes . Sometimes there's nothing to combine, which makes things easier, but I've carefully made sure that there's always something to combine in these examples, so that we'll go through only the most general case.

Ignoring the other diagrams so far, let's look just at this particular diagram:

          I--J   <-- br1
         /
...--G--H
         \
          K--L   <-- br2

We'll pick one of the two branches and use git switch or git checkout to make it the current branch and hence make J or L the current commit. For simplicity (and because the merge result is symmetric in most practical cases anyway) let's just pick br1. Then we'll run git merge and give it the other branch name, which Git will use to find the other of the two commits, i.e., commit L.

To perform the actual work of merging, Git must now find the common starting point . Git does this on its own, using the graph implied by the parent links. That is, Git works backwards from the current commit J to find I and H and so on. At the same time, Git works backwards from the other commit we named: L leads back to K , which leads back to H .

It's obvious from this drawing that the first shared commit, on both branches , is commit H . (Or is that the last shared commit? Depends on whether you think backwards, like Git does, doesn't it?) Anyway, skipping over lots more potential complexities that don't occur in carefully chosen examples like this one, commit H is our merge base , or common starting point here, for git merge .
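
One way to picture the merge-base search is as set arithmetic on the graph: collect everything reachable from each tip, intersect, and keep the common commits that are not ancestors of another common commit. The Python below is a sketch of that idea on the drawing above, not Git's actual algorithm; the function names are invented, and G is treated as the root of the toy history.

```python
parents = {"G": [], "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"], "L": ["K"]}


def ancestors(tip):
    """Return the set of commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge_bases(a, b):
    """Return the 'best' common ancestors: common commits not reachable from another one."""
    common = ancestors(a) & ancestors(b)
    return sorted(
        c for c in common
        if not any(c != d and c in ancestors(d) for d in common)
    )


print(merge_bases("J", "L"))
```

The result is a list because some graphs have more than one best common ancestor, one of the complexities skipped here. In a real repository, git merge-base br1 br2 prints the chosen commit.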

To combine work, what Git does now is to run git diff on (i.e., compare the snapshots in) commits H and J, to see what we did on branch br1, and then, separately, diff commits H and L, to see what they (whoever "they" are) did on branch br2. For each file, this gives us "what changed", and the changes apply to the same starting point because there's just the one copy of that file in H, and the one copy in each of the two targets. That makes it easy for Git to combine the two sets of changes. Each change is either adding or deleting some lines, because that's what comes out of git diff internally.

Having combined the changes—taken the "ours" (H-to-J) changes and the "theirs" (H-to-L) changes—for any one particular file, Git then applies both changes to the copy of the file from the base commit, H . Where (if) two changes overlap , Git requires that they be exactly the same change, otherwise Git declares a merge conflict and forces you—the programmer—to fix things up. (This description omits a number of special cases, but covers the majority of what Git does for you.)
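
The combining rule can be sketched in Python, with one loud simplification: this version treats each file as a single unit, whereas Git compares ranges of lines inside each file. Snapshots are assumed to be plain dictionaries from path to content, and three_way_merge is a made-up name.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots against their base at whole-file granularity.

    Returns (merged snapshot, list of conflicting paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:      # both sides agree (unchanged, or exactly the same change)
            result = o
        elif o == b:    # only "theirs" changed this file
            result = t
        elif t == b:    # only "ours" changed this file
            result = o
        else:           # both changed it, differently
            conflicts.append(path)
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "2", "c.txt": "3"}
ours = {"a.txt": "1 ours", "b.txt": "2", "c.txt": "3 ours"}
theirs = {"a.txt": "1", "b.txt": "2 theirs", "c.txt": "3 theirs"}
merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)
print(conflicts)
```

Here a.txt takes our change, b.txt takes theirs, and c.txt is reported as a conflict because both sides changed it in different ways.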

If Git is able to combine all the changes to all the various changed files, and apply those combined changes to all the files from the merge base commit, Git will go on to make a new commit. This new commit is special in exactly one way: instead of listing one previous commit, it lists two. That's all. Like every other commit, this new commit has a snapshot and metadata: the snapshot stores all the files, in their merged form, and the metadata shows that you made this commit, just now. The only thing special about this new commit is that it links back to both commits J and L, like this:

          I--J
         /    \
...--G--H      M
         \    /
          K--L

As usual, Git writes the new commit's hash ID in the current branch name . Since we said we made br1 the current branch, that means we finish off our drawing this way:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

Note how, at this point, the commits "on" br1 include commit M , and then—because M points back to both J and L —commits J and L , and I and K , and then H and G and so forth. But when we use the name br2 , the commits that are "on" this branch start at L , and go back to K , then H , then G and so on. Git literally can't go forwards: commit H does not connect to either I or K . All the connections are strictly backwards.

So commit M is not on br2; it's only on br1. The same goes for I-J. But merging like this has caused commits K-L, which were only on br2 before, to be on both branches now.
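
A small Python model makes the asymmetry visible. As before, this is only a sketch: parents, refs, head, and the merge function are invented stand-ins, and the snapshot-combining work is left out so that only the graph update remains.

```python
parents = {"G": [], "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"], "L": ["K"]}
refs = {"br1": "J", "br2": "L"}
head = "br1"


def reachable(tip):
    """Return the set of commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def merge(other, new_id):
    """Create a merge commit with two parents and move only the current branch to it."""
    parents[new_id] = [refs[head], refs[other]]
    refs[head] = new_id


merge("br2", "M")
only_on_br1 = reachable(refs["br1"]) - reachable(refs["br2"])
only_on_br2 = reachable(refs["br2"]) - reachable(refs["br1"])
print(sorted(only_on_br1))
print(sorted(only_on_br2))
```

After the merge, I, J and M are reachable only from br1, while nothing is reachable only from br2.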

This is in fact the answer to your question, but to finish things off, let's look at the rest of the details. There are often a bunch of complications in the way of seeing it.

Ahead and/or behind

Given any two branch names, Git can count how many commits one name is "ahead of" the other, and how many it is "behind", simply by counting the commits that we can only "see" on one of the two names:

...--F   <-- br1
      \
       G--H   <-- br2

Here, branch br2 is "two commits ahead of" br1 because commits up through F are on both branches but G-H are only on br2. This in turn means that br1 is two commits behind br2. When we have a "forking" or "branching" structure like this one:

          I--J   <-- br1
         /
...--G--H
         \
          K   <-- br2

we find that br1 is "ahead 2" and "behind 1" of br2, and br2 is "ahead 1" and "behind 2" of br1. These are precisely commits I-J (the two) and K (the one).

Merging causes commits that weren't "on" some branch to be "on" the branch. That branch is now "less behind", maybe not at all behind. But making a new merge commit causes the branch that just acquired the new commit to be "ahead".
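
Counting ahead and behind is just two set differences. Here is a minimal Python sketch on the forking drawing above; ahead_behind and reachable are illustrative names, and G is treated as the root of the toy history.

```python
parents = {"G": [], "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"]}
refs = {"br1": "J", "br2": "K"}


def reachable(tip):
    """Return the set of commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def ahead_behind(mine, other):
    """Return (ahead, behind) for branch `mine` relative to branch `other`."""
    mine_set, other_set = reachable(refs[mine]), reachable(refs[other])
    return len(mine_set - other_set), len(other_set - mine_set)


print(ahead_behind("br1", "br2"))
print(ahead_behind("br2", "br1"))
```

Swapping the two names swaps the two numbers, as the text describes.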

Note that if you draw the graph , a lot of these things get clearer. You can do this by hand, or have Git and/or other programs do it. See also Pretty Git branch graphs .

More questions

This is the whole answer to your question as asked, but there are a few more questions we should ask. Let's look now at these.

You have a repository, local to you, where you do your work. Let's call this "on your laptop" (whether or not your computer is a laptop computer) as this gives us a good name for it.

Meanwhile, GitHub has a different repository, stored on their server. This other repository has its own databases. The commits that are in their objects database, and the commits that are in your laptop Git's objects database, get shared: their unique hash IDs tell them apart, and when you and they have the same commit, you and they have a database entry whose key is that particular shared hash ID and the contents match. (This database is a simple key-value store, with hash IDs as the keys.) But, and this is important, the branch names are not shared. They have their names, and you have yours; each of their branch names holds one hash ID, and each of your branch names holds one hash ID.

When you ran git clone to make your laptop repository, your Git read, from their Git, all the commits and stuffed them into your new objects database, so that you had copies of all their commits. But your Git took each of their branch names, like master , and changed them into remote-tracking names like origin/master . Your Git software will copy their branch names into your repository by making this change, turning branch names into remote-tracking names. This keeps your branch names yours .

The git fetch command has your Git reach out to their software, using the URL stored under the name origin . Your Git has their Git list out their branch names and commit hash IDs. Your Git can immediately tell if you have all their commits or not, just by checking those hash IDs. For any commits you are missing, your Git will ask their software to package up those commits and send them over, and it will do that. Those commits come with the parent commits, up until the point where your Git already has the commit(s) needed, so that you get all their new commits that they have that you don't. Then your Git will update all your remote-tracking names to remember their new most-recent-commits for each branch (what Git calls a tip commit , whether it's a branch tip or a remote-tracking-name tip).
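
The effect of a fetch on the two kinds of names can be sketched as follows. This Python model is a deliberate simplification: it copies every commit you are missing instead of negotiating over hash IDs, and the dictionary layout and the fetch function are assumptions made for illustration.

```python
def fetch(local, remote, remote_name="origin"):
    """Copy missing commits, then record the remote's branch tips as remote-tracking names."""
    for commit_id, parent_ids in remote["commits"].items():
        # Commits are keyed by hash ID, so ones we already have are simply skipped.
        local["commits"].setdefault(commit_id, list(parent_ids))
    for branch, tip in remote["branches"].items():
        # Their branch "master" becomes our remote-tracking name "origin/master".
        local["remote_tracking"][f"{remote_name}/{branch}"] = tip


remote = {"commits": {"G": [], "H": ["G"], "J": ["H"]}, "branches": {"master": "J"}}
local = {
    "commits": {"G": [], "H": ["G"], "I": ["H"]},
    "branches": {"master": "H", "feature": "I"},
    "remote_tracking": {"origin/master": "H"},
}
fetch(local, remote)
print(local["remote_tracking"])
print(local["branches"])
```

The remote-tracking name moves to their new tip J, while your own branch names stay exactly where they were.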

If you run:

git status

you will see an "ahead" and/or "behind" count relative to the upstream of your (local) branch, which is typically the corresponding remote-tracking name in your own repository. So you run:

git fetch

(or git fetch origin ) to update your repository with any new commits they have that you don't, and update all your origin/* names to remember the tip commits of their branches, and then git status tells you about your branch foo with respect to their origin/foo , assuming the upstream of your foo is origin/foo .

To send commits to them you use git push . This is as close as Git gets to the opposite of git fetch , but it has some key differences. As before, your Git calls up their software at the stored URL. You (or your Git) then offer to them a commit hash ID: usually, the tip commit of one of your branches. If they don't have that commit yet, your Git software has to offer the commit's parent(s), just like for git fetch , until you and they find the shared commits that you don't need to re-send; you then send all these commits. But then, instead of having them set a "remote-tracking name" in their names database, your Git will ask—politely, by default—that they change one of their branch names to remember this as their new tip commit for that branch.

That is, you don't say to them: "Here's some new commits, remember them under grant/master" but rather: "Here's some new commits, now please set your own master to remember the tip commit as its latest commit." In other words, you'd like them to add commits to their branch.

GitHub, GitLab, and other hosting providers all add various protection features so that not just anybody can add commits (base Git has no permissions checking!), but assuming you have permission and you're just adding new commits, they generally do as you ask. In any case, whether or not they permit this update, they send back a response: "OK, I set my branch" or "no, because ____" (fill in the blank). The OK response makes your Git update your remote-tracking name, because they accepted the update. The "no" makes your Git produce a "rejected" error, using the filled-in blank.

The most common error (besides the permission-denied type errors that hosting sites add) is "non-fast-forward", which is Git's peculiar way of saying: I can only remember one hash ID in a name, and if I make this change, some commits that I can access right now via that name, will no longer be find-able via that name.
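
The non-fast-forward check boils down to one reachability test: the update is safe if the branch's old tip can still be found from the proposed new tip. The Python below sketches that rule only; it ignores permissions and forced pushes, assumes the commits have already been transferred, and uses invented names throughout.

```python
def reachable(tip, parents):
    """Return the set of commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def push(server_refs, branch, new_tip, parents):
    """Move the server's branch to `new_tip` unless that would lose commits."""
    old_tip = server_refs.get(branch)
    if old_tip is not None and old_tip not in reachable(new_tip, parents):
        return "rejected (non-fast-forward)"
    server_refs[branch] = new_tip
    return "ok"


parents = {"G": [], "H": ["G"], "I": ["H"], "K": ["G"]}
server_refs = {"master": "H"}
first = push(server_refs, "master", "I", parents)   # I descends from H
second = push(server_refs, "master", "K", parents)  # K cannot reach H or I
print(first, second, server_refs)
```

The second push is refused because moving master to K would leave H and I unreachable through that name.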

Since you'll generally view ahead/behind through the lens of "these pushes worked and updated my remote-tracking names" or "I just ran git fetch and updated my remote-tracking names", what you see here is often a single or double level reflection of what you've done in your own repository. This makes for a bit of confusion sometimes. But if we take the repository I drew early on above:

          E  <-- release/1.0
         /
...--C--D--I   <-- hot-fix (HEAD)
         \
          F--G   <-- master
              \
               H   <-- feature

and then add merge commits as appropriate:

git switch master && git merge hot-fix

          E  <-- release/1.0
         /
...--C--D--I___  <-- hot-fix
         \     \
          F--G--J   <-- master (HEAD)
              \
               H   <-- feature

(new commit J on master ) and:

git switch feature && git merge hot-fix

          E  <-- release/1.0
         /
...--C--D--I______  <-- hot-fix
         \     \  \
          F--G--J  \  <-- master
              \     \
               H-----K   <-- feature (HEAD)

(the diagrams get messy, but here's new merge K on feature) we can see that feature is now "ahead of" master because commit K isn't on master. That's true even if we just merge master into feature instead:

git switch feature && git merge master

produces:

          E  <-- release/1.0
         /
...--C--D--I___  <-- hot-fix
         \     \
          F--G--J   <-- master
              \  \
               H--K   <-- feature (HEAD)

It's the new merge on feature , commit K , that pushes us "ahead of" master every time.
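
Plugging the two made-up graphs above into the counting rule gives concrete numbers. This Python sketch treats C as the root (the real history continues before it), and the table and function names are illustrative only.

```python
def reachable(tip, parents):
    """Return the set of commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def ahead_behind(mine, other, parents):
    """Return (ahead, behind) for tip `mine` relative to tip `other`."""
    mine_set, other_set = reachable(mine, parents), reachable(other, parents)
    return len(mine_set - other_set), len(other_set - mine_set)


# Shared part of both drawings: J is the merge of hot-fix (I) into master (G).
base = {"C": [], "D": ["C"], "E": ["D"], "F": ["D"], "G": ["F"],
        "H": ["G"], "I": ["D"], "J": ["G", "I"]}
hotfix_into_feature = {**base, "K": ["H", "I"]}  # git merge hot-fix while on feature
master_into_feature = {**base, "K": ["H", "J"]}  # git merge master while on feature

# feature's tip is K and master's tip is J in both cases.
print(ahead_behind("K", "J", hotfix_into_feature))
print(ahead_behind("K", "J", master_into_feature))
```

In the first case feature is 2 ahead and 1 behind: the single commit it lacks is J, the merge commit made on master, even though the hot-fix changes from I are on both branches. In the second case feature is 2 ahead and 0 behind, because K reaches J. In a real repository, git rev-list --left-right --count master...feature reports the same pair of numbers, with master's count first.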

(Git has a non-merge-y kind of operation that Git calls, misleadingly, a fast-forward merge, that git merge can perform, that doesn't make a new merge commit. It can only be used in particular circumstances, and GitHub in particular will never do this, no matter how much you might want them to; GitLab may differ here. But presumably this is not what you're doing, or you would not have had this question. I skipped over it in the merge section for space reasons.)
