
Put several commits made in detached HEAD state into a new branch

While working on my master branch I made some buggy changes which basically ruined it:

$ git log --oneline
9de8556 (master) This is incomprehensibly buggy
b627524 More bad choices 
8239459 Possibly bad changes 
0d86609 (github/master) This is a commit already on github

So I checked out 0d86609 without creating a new branch and kept working from that commit, which produced a few commits in detached HEAD state:

$ git log --oneline
8aef120 (HEAD) some stable version 
4a78d72 more hacking ... 
e2ff002 hacking ...
e5e03c5 making progess...
ae7a717 more cautious changes 
b140766 cautious changes 
0d86609 (github/master) This is a commit already on github

None of this is on GitHub yet. I would like to create a branch that includes the history of commits made in the detached HEAD state (not just the most recent one), completely remove the buggy commits from my master branch, and then merge my new branch back into master. Alternatively, I would be happy to completely remove the buggy commits and advance the master branch along the recent commits.

How can I achieve this? The result should look like master is on 8aef120 (or some merge thereof), as if nothing had happened.

Bonus question: I have come to understand that commits in detached head are not part of a branch. But does Git still keep track of their commit history relations? Keeping that history is the specific reason I ask this question and that has not been apparent from previous questions or other resources.

TL;DR

git branch -f master HEAD       # the name HEAD here is actually redundant
git checkout master             # or `git switch master` in Git 2.23 or later

or:

git checkout -B master          # see long explanation below

Long

We should actually start with this:

Bonus question: I have come to understand that commits in detached head are not part of a branch. But does Git still keep track of their commit history relations?

Yes. And in fact, whether these commits are "part of a branch" depends on how you define the word branch. It's accurate enough to say that these are not on any branch, but it's equally accurate to say that they are a branch. For more about this conundrum, see the question 'What exactly do we mean by "branch"?' Git's little trick here is that branch names don't really mean anything. Git is really all about commits. Everything is in the commits themselves.

First, remember that all commits are numbered, by those hash IDs like 8aef120 (which itself is an abbreviation). These look random, but aren't: each is actually a cryptographic checksum of the contents of that particular commit, so that any Git anywhere will agree that that commit gets that ID. They're as big and ugly as they are so that every commit gets a new, unique ID.

All commits (well, all Git objects, really; commits are one of four types of object) are stored in a big database. A Git repository primarily is this big database of commit and other objects, which Git can look up by their numbers. Without getting into details,[1] let's just say that a commit consists of two parts:

  • The main data of a commit consists of a complete snapshot of every file that Git knew about at the time the commit was made. For space reasons in this answer we won't go into any detail at all about how this works.

  • Each commit also has metadata. Some of it shows who made the commit, when, and why (the log message), but one part gives the raw hash ID of the commit that comes before this particular commit.

What this second part means is that each commit works as a pointer to the previous commit. (We'll skip right over merge commits, which have more than one such pointer, for now.) Hence, if we like, we can draw a row of commits like this, using uppercase letters to stand in for the random-looking hash ID numbers:

... <-F <-G <-H

Here H is the hash ID of the last commit in the chain. Inside H, Git has stored the hash ID of earlier commit G. Of course, inside G, Git has stored the hash ID of earlier commit F, and so on. So each commit "points back" to what Git calls its parent. If we can somehow have Git find commit H, Git can use that to find earlier commit G, and so on, backwards.

So, now the question is: where will we store the big ugly hash ID H? We could jot it down on scrap paper, or on a whiteboard, or something, but hey: we have a computer. Let's have the computer remember it for us. Let's give it a name that means something to us, like master or develop, too!

This is precisely what a branch name does for us: each branch name holds the hash ID of the last commit in the chain. So we should draw this as:

...--F--G--H   <-- master

We can turn the arrows from commit to commit into a straight line, because once we make a commit, nothing in that commit (including the parent hash ID) can ever change. So these are fixed forever; we just need to remember that it's easy to go from child to parent, but very difficult to do the opposite. Children remember their parents, but parents can't remember their children. But branch names change their arrows. If we have Git add a new commit while we're "on" master, this new commit I will point back to existing earlier commit H:

...--F--G--H
            \
             I

and now Git will write the new hash ID for I into the name master :

...--F--G--H
            \
             I   <-- master

(and there's no need to use an extra row for this drawing any more, but the extra row is a limitation of text drawings in the first place).
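To watch the name move, here is a minimal sketch in a throwaway repository. The temporary directory, the placeholder identity, and the empty commits labeled H and I are assumptions made for illustration, not anything from the question's repository:

```shell
# Scratch repository; nothing here touches your real work.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "H"
old_tip=$(git rev-parse master)           # the hash ID currently stored in the name master

git commit -q --allow-empty -m "I"
new_tip=$(git rev-parse master)           # the name now stores I's hash ID
parent_of_new=$(git rev-parse master^)    # the parent hash ID recorded inside I

echo "master moved from $old_tip to $new_tip"
echo "parent of the new tip: $parent_of_new"
```

The parent printed on the last line should be the same hash ID that master held before the second commit: I points back to H, and only the name moved.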


[1] Details can be important. Here is an actual example commit (with @s munged as a spam deterrent), as shown by the mainly-internal-usage git cat-file command:

$ git cat-file -p HEAD | sed 's/@/ /'
tree f82fc14c164819b1f3685098896fa8533809175c
parent 5a0482662f076ca7e1f27ef2848feec1763583d1
author Junio C Hamano <gitster pobox.com> 1597878870 -0700
committer Junio C Hamano <gitster pobox.com> 1597878893 -0700

Ninth batch

Signed-off-by: Junio C Hamano <gitster pobox.com>

Note that the actual hash ID of this commit is:

$ git rev-parse HEAD
675a4aaf3b226c0089108221b96559e0baae5de9

but note how I was able to just use the word HEAD above, without typing in the raw hash ID.


Attached and detached HEAD modes

When you have more than one branch name, like this:

          I   <-- feature
         /
...--G--H   <-- develop, master

Git needs a way to know which branch name you are using. If you use git checkout master or git switch develop (Git 2.23 or later), either way you are using existing commit H, but when you make a new commit, Git needs to know which name to update. If you use git switch feature (or, pre-2.23 or if you prefer the old way, git checkout feature), you'll be using commit I now, and new commits should update the name feature.

To make this work, we just attach the special name HEAD to exactly one branch name:

          I   <-- feature
         /
...--G--H   <-- develop (HEAD), master

It's now clear that we are using commit H, and that Git will update develop to point to a new commit once we make it.

In detached HEAD mode, Git simply has the special name itself point directly to a commit:

...--G--H   <-- master, HEAD

If we now make new commits, the name HEAD itself moves, instead of the name to which HEAD is attached:

...--G--H   <-- master
         \
          I--J   <-- HEAD

Git now finds commit J using the name HEAD, finds commit I by working one step back from HEAD, finds commit H through either working back two steps or using the name master, and so on.

To exit detached HEAD mode, simply give the name of a branch to git checkout or git switch. Git will switch to the commit that branch name selects and re-attach the name HEAD to that branch name. Any commits you made in the detached HEAD mode, however, have now become difficult to find.
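Here is a small sketch of that whole cycle in a scratch repository. The directory, the placeholder identity, and the empty commits H, I, and J are assumptions for illustration. It relies on git symbolic-ref -q HEAD exiting non-zero when HEAD is detached, and on the HEAD reflog remembering where HEAD used to be:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "H"
git checkout -q --detach                  # HEAD now holds H's hash ID directly
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "J"
detached_tip=$(git rev-parse HEAD)

# Is HEAD attached to a branch name right now?
if git symbolic-ref -q HEAD >/dev/null; then attached_before=yes; else attached_before=no; fi

# The parent links exist with or without a branch name: J, I, H.
chain_length=$(git rev-list --count "$detached_tip")

git checkout -q master                    # re-attach HEAD; no name points at I or J any more
attached_to=$(git symbolic-ref --short HEAD)
previous_head=$(git rev-parse 'HEAD@{1}') # the HEAD reflog still remembers commit J

echo "attached while detached? $attached_before; now attached to: $attached_to"
echo "commits reachable from the detached tip: $chain_length"
```

The reflog entry is what makes "difficult to find" different from "gone": until the entry expires, HEAD@{1} (or the raw hash ID, if you saved it) still leads back to J.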

Names that aren't branch names

Besides branch names like master, Git has a whole slew of additional names. The two that you should be (or become) familiar with are tag names and what I now call remote-tracking names. Git calls the latter remote-tracking branch names, but I have found that the word branch here is distracting and of negative value, so I just omit it.

The function of these names is almost identical to that of branch names. The key difference, the part that makes them not branch names, is that if you give one to git checkout, Git will put you in detached HEAD mode. That is, the names select one commit, just like branch names, and you can give them to git checkout, just like branch names. But Git won't attach HEAD to them. It will just go into detached HEAD mode instead. (The new git switch command requires that you add --detach before it will do this, instead of just detaching your HEAD. That's nicer to new Git users.)

Remote-tracking names

Your github/master is presumably a remote-tracking name. These names have a different special property. Every Git repository has both an object database (which stores the commits and other internal objects) and a name-to-hash-ID "database".[2] This means each Git repository has its own branch names. Your Git's branch names are yours. You can change them whenever you like. That means the GitHub repository, which is a separate Git,[3] has its own branch names, too.

Your Git will remember their branch names, and corresponding hash IDs, using your remote-tracking names. That is, when you connect your Git to their Git, your Git sees that GitHub's master is a1234567 or whatever it is. Your Git then makes sure you have the commit too, and once you do, creates or updates your github/master to point to a1234567, so that you can see that they call a1234567 master.
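The following sketch imitates this with two local repositories. A bare repository named theirs.git stands in for GitHub; that stand-in, the directory names, the placeholder identity, and the commit messages are all assumptions for illustration:

```shell
work=$(mktemp -d) && cd "$work" || exit 1
git init -q --bare theirs.git             # stand-in for the repository on GitHub (assumption)
git init -q mine && cd mine || exit 1
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "already on github"
git remote add github ../theirs.git
git push -q github master                 # their Git now has its own master
git fetch -q github                       # make sure our memory of their names is current

theirs=$(git --git-dir=../theirs.git rev-parse refs/heads/master)
tracked=$(git rev-parse refs/remotes/github/master)

git commit -q --allow-empty -m "local only"
mine=$(git rev-parse master)              # our master moved; github/master did not

git checkout -q github/master             # not a branch name, so HEAD detaches
if git symbolic-ref -q HEAD >/dev/null; then state=attached; else state=detached; fi

echo "their master:  $theirs"
echo "github/master: $tracked"
echo "our master:    $mine (HEAD is $state)"
```

The first two hash IDs should match, the third should differ, and the final checkout should leave HEAD detached, which is how the questioner ended up committing with no branch name in the first place.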


[2] I put this in quotes because, as of today, the state of this name-to-ID-mapping is pretty primitive. There is ongoing work in Git to put in a real database, but it will be some number of Git releases before this is ready for public use.

[3] Although this other Git repository is separate, it shares commits with your Git. Their hash IDs and yours match thanks to the magic of cryptographic checksums. To see if you both have the same commits, your Git and their Git simply compare the hash IDs.


Summary so far: what you know by now

  • Git stores commits.
  • Commits are numbered by hash ID.
  • Names store one number: the hash ID.
  • Branch names have a special property: the attachment ability.
  • HEAD is normally attached to a branch name (so that HEAD remembers the name that gets updated by git commit, and the name remembers the hash ID), but can be detached (so that HEAD remembers the hash ID and gets updated directly).
  • git checkout branch-name results in an attached HEAD, but git checkout anything-else results in a detached HEAD.
  • If we've made commits in detached HEAD mode and want to remember the last one, we'll need to set up a name for it.
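As a quick sketch of that last point: giving the detached commits a name keeps the whole chain, not just the newest commit. The branch name stable-work is hypothetical, and the scratch repository, placeholder identity, and commit messages are assumptions for illustration:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "already on github"
base=$(git rev-parse HEAD)

git checkout -q --detach                  # work with no branch name
git commit -q --allow-empty -m "cautious changes"
git commit -q --allow-empty -m "some stable version"

git branch stable-work                    # hypothetical name; stores the current commit's hash ID
kept=$(git rev-list --count "$base..stable-work")

git checkout -q stable-work               # attach HEAD to the new name
now_on=$(git symbolic-ref --short HEAD)

echo "commits after the base reachable from stable-work: $kept"
echo "HEAD is attached to: $now_on"
```

Both detached commits are reachable from the new name, because the name only has to find the last one; the parent links inside the commits do the rest.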

Creating, destroying, and moving branch names

Git can easily create a new branch name at any time (well, almost any time) and can destroy any branch name except the one HEAD is attached to (assuming HEAD is attached). The only requirement for creating a new branch name is that it must contain the hash ID of some valid commit that exists in this Git repository right now.

Hence, if we're on master like this:

...--G--H   <-- master (HEAD)

we can just have Git create a new name that also points to existing commit H. The git branch command has this as its default action:

git branch newbranch

results in:

...--G--H   <-- master (HEAD), newbranch

With -d, git branch can delete one of these names. If we just made newbranch like this, it's safe to delete it, because master still finds commit H, but it would be unwise to delete the name feature if we had:

          I   <-- feature
         /
...--G--H   <-- master (HEAD)

because once we delete that name, how will we find the actual (random-looking) hash ID I? We (and Git) can only go backwards, from H to G.[4]

Anyway, git branch defaults to making the new branch point to the current commit, which Git finds using the name HEAD. But we can just give it the name of some other branch, or the raw hash ID of a commit. So we can create any branch name we like, pointing to any existing commit.

With the -f argument, we can move any branch name we like, to any existing commit, with (again) that funny little restriction: we can't move the current branch name, to which HEAD is attached, using git branch -f. (We can move it with git reset instead; the reason why is too long to go into here.) In detached-HEAD mode, all branch names can be moved, of course.
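A minimal sketch of creating, moving, and deleting a name, including the refusal to force-move the current branch. The scratch repository, placeholder identity, and the commits labeled G and H are assumptions for illustration:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "G"
git commit -q --allow-empty -m "H"

git branch newbranch                      # create: points at the current commit H
created_at=$(git log -1 --format=%s newbranch)

git branch -f newbranch master~1          # move: newbranch now points at G
moved_to=$(git log -1 --format=%s newbranch)

# HEAD is attached to master, so git branch -f refuses to move master.
if git branch -f master master~1 2>/dev/null; then refused=no; else refused=yes; fi

git branch -d newbranch >/dev/null        # delete: safe, G is still reachable from master
if git rev-parse -q --verify refs/heads/newbranch >/dev/null; then deleted=no; else deleted=yes; fi

echo "created at $created_at, moved to $moved_to, force-move of master refused: $refused, deleted: $deleted"
```

Note that none of these commands touched a commit. Only the name-to-hash-ID mapping changed each time.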


[4] In fact, if we have:

          I   <-- feature
         /
...--G--H   <-- master (HEAD)
         \
          J   <-- newbranch

there isn't a commit after H: there are two. We'd have to pick a direction. There is a way to do that: we pick the last commit, in the direction we want to go, and then work backwards from there until we arrive at H. The commit we used to get to H is the next commit "in that direction". But we need that start-from-the-end-and-work-backwards thing, which means we need an end, and that's what a branch name is. It gives us the last commit in that branch.
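One way to sketch that start-from-the-end walk is with git rev-list, which lists the commits reachable from an end point but not from H; reading the list oldest-first puts H's child at the front. The scratch repository, placeholder identity, and the extra commit labeled I2 are assumptions for illustration, and this simple form assumes a linear chain between H and the chosen end:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "H"
h=$(git rev-parse HEAD)

git checkout -q -b feature                # one direction: H, then I, then I2
git commit -q --allow-empty -m "I"
git commit -q --allow-empty -m "I2"

git checkout -q master
git checkout -q -b newbranch              # the other direction: H, then J
git commit -q --allow-empty -m "J"

# Start at the end named by each branch, walk back to (but not including) H,
# then take the oldest commit found: that is H's child in that direction.
toward_feature=$(git rev-list --reverse "$h..feature" | head -n 1)
toward_newbranch=$(git rev-list --reverse "$h..newbranch" | head -n 1)

echo "after H toward feature:   $(git log -1 --format=%s "$toward_feature")"
echo "after H toward newbranch: $(git log -1 --format=%s "$toward_newbranch")"
```

Without the names feature and newbranch there would be no end to start from, which is the point of the footnote.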


Now you know what we're doing in the first example

We start with a pictorial diagram like this:

          I--J--K   <-- master
         /
...--G--H   <-- github/master
         \
          L--M--N--O--P--Q   <-- HEAD

where Q is your commit 8aef120 and K is your commit 9de8556.

We then run:

git branch -f master HEAD

which tells Git: Make the name master point to commit Q . We forcibly move that name to point to the current commit, resulting in:

          I--J--K   [abandoned]
         /
...--G--H   <-- github/master
         \
          L--M--N--O--P--Q   <-- master, HEAD

Then we just run git checkout master to re-attach HEAD to get:

          I--J--K   [abandoned]
         /
...--G--H   <-- github/master
         \
          L--M--N--O--P--Q   <-- master (HEAD)

and we're done.
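The whole sequence can be rehearsed in a scratch repository before running it for real. In this sketch the temporary directory, the placeholder identity, and the shortened history (two buggy commits, two good ones) are assumptions for illustration; only the last two git commands before the checks are the actual fix:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "This is a commit already on github"
base=$(git rev-parse HEAD)
git commit -q --allow-empty -m "Possibly bad changes"
git commit -q --allow-empty -m "This is incomprehensibly buggy"
buggy_tip=$(git rev-parse master)

git checkout -q --detach "$base"          # the detached-HEAD work from the question
git commit -q --allow-empty -m "cautious changes"
git commit -q --allow-empty -m "some stable version"
good_tip=$(git rev-parse HEAD)

git branch -f master HEAD                 # the fix, step 1: move the name master
git checkout -q master                    # the fix, step 2: re-attach HEAD

now=$(git rev-parse master)
on_branch=$(git symbolic-ref --short HEAD)
abandoned=$(git rev-parse 'master@{1}')   # reflog of master: where the name pointed before
if git merge-base --is-ancestor "$buggy_tip" master; then buggy_reachable=yes; else buggy_reachable=no; fi

echo "master is at the stable commit: $([ "$now" = "$good_tip" ] && echo yes || echo no)"
echo "on branch $on_branch; buggy commits reachable from master: $buggy_reachable"
```

The buggy commits are no longer part of master's history, but master@{1} still names the old tip for as long as that reflog entry survives, so the move can be undone if it turns out to be a mistake.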

Using git checkout -B or git switch -C

Both the old git checkout and the new-since-2.23 git switch have a flag, -b and -c respectively, to tell them: Do the checkout / attaching-HEAD thing while creating a new branch name. As with git branch, you can give these the raw hash ID of some existing commit, or any name or expression that Git can resolve into a raw hash ID. So:

git checkout -b somebranch github/somebranch

is one way to create a branch name from a remote-tracking name. Git has a built-in shortcut for this sort of thing, and all of these have a special feature: if the git checkout fails,[5] they don't create the new branch name after all. Compare that to git branch somebranch github/somebranch and then a subsequent git checkout somebranch that fails: with the two-commands method, you're left with your own new somebranch name, so you might have to delete it, if that was a mistake after all.

In our case, though, we don't want to create a new branch name. The uppercase variants (-B for git checkout and -C for git switch) do the same thing as the git branch -f option, with another special feature: they won't modify the existing branch name unless the checkout succeeds. So:

git checkout -B master

tells Git: Find the commit identified by HEAD (since we didn't name a specific commit, it defaults to using HEAD). Then, try to switch to that commit. (Of course, we're already on that commit, so there's nothing to switch: this step always works.) Last, if that worked, make the name master point to that commit, and re-attach HEAD to the name master.
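The one-command version can be sketched the same way. The scratch repository, placeholder identity, and commit messages are assumptions for illustration; on Git 2.23 or later, git switch -C master could be used in place of the checkout line:

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # first branch is "master" whatever init.defaultBranch says
git config user.name "Example User"       # placeholder identity (assumption)
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "This is a commit already on github"
base=$(git rev-parse HEAD)
git commit -q --allow-empty -m "This is incomprehensibly buggy"

git checkout -q --detach "$base"          # detached-HEAD work, as in the question
git commit -q --allow-empty -m "some stable version"
good_tip=$(git rev-parse HEAD)

git checkout -q -B master                 # move master to the current commit and attach HEAD to it

master_tip=$(git rev-parse master)
on_branch=$(git symbolic-ref --short HEAD)
echo "on branch $on_branch; master at the stable commit: $([ "$master_tip" = "$good_tip" ] && echo yes || echo no)"
```

The end state is the same as with git branch -f master HEAD followed by git checkout master.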


[5] There are a lot of reasons for git checkout to fail. Most of them are unlikely; the most common reason for failure is when you have uncommitted work that would be overwritten by such a checkout.

Branches are just pointers to revisions that can move around. Hope that helps you a little bit in your almost-there understanding. So, yes... revisions have the whole thing.

So.... you want the local branch to point to your current revision.... and forget about what you did in master past github/master.... get ready... it's like 50 operations in a row:

git branch -f master

... and that's it. You moved the pointer. Oh, sure!

git checkout master

Now you got it.


 