Git: Fix repo with a branch and its remote on different branches

Question

I have no idea how this happened, but I have a git repo in the following kind of state:

A--B--C--D  master
    \
     E--F   origin/master

The commits E and F are at this point obsolete; the changes they make on B have no relevance and they can be deleted if necessary.

I only keep the origin remote as a backup for my code, and nobody else has cloned it, so I can use whatever brute force I need.

What would be the cleanest way to clear this up?

Answer 1

"I only keep the origin remote as a backup for my code, and nobody else has cloned it, so I can use whatever brute force I need."

In that case, git push --force origin master will suffice. Be very sure you're OK with origin/master in your repository, and master in the (separate) Git repository over on origin, no longer being able to find commit F.
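As a rough sketch, the situation in the question and the fix can be reproduced in a throwaway directory. The temporary paths, the "demo" identity, and the local bare repository standing in for origin are all assumptions made for illustration; they are not part of the original setup.

```shell
set -eu
# Throwaway identity so commits work without global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
# A local bare repository plays the role of "origin".
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Build A--B--E--F and push it, so origin/master points to F.
for m in A B E F; do git commit -q --allow-empty -m "$m"; done
git push -q origin master
# Rewind local master to B and build C--D on top of it.
git reset -q --hard HEAD~2
for m in C D; do git commit -q --allow-empty -m "$m"; done

# Shows the fork at B: master at D, origin/master at F.
git log --oneline --graph master origin/master

# The fix: command origin to set its master to D, abandoning E and F.
git push -q --force origin master
git log --oneline --graph master origin/master
```

After the forced push, master and origin/master name the same commit, and the repository on origin no longer has any branch leading to E or F.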

(Your own repository will remember both commits E and F for some time, under the reflog name origin/master@{number}, so as long as you keep your own repository intact, you can still get them back if they turn out not to be obsolete after all. However, at some point they will really be removed: by default, reflog entries for commits no longer reachable from the name expire after 30 days, after which garbage collection can delete the commits. The server may not have reflogs at all, and its copy of commits E and F may go away relatively quickly: in seconds, or hours, or just a few days, for instance.)
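A minimal sketch of that recovery path, using the same kind of throwaway setup (temporary paths, a "demo" identity, and a local bare repository standing in for origin are assumptions). The branch name rescue is hypothetical; any unused name would do.

```shell
set -eu
# Throwaway identity so commits work without global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Recreate the fork, then force-push so origin/master moves from F to D.
for m in A B E F; do git commit -q --allow-empty -m "$m"; done
git push -q origin master
git reset -q --hard HEAD~2
for m in C D; do git commit -q --allow-empty -m "$m"; done
git push -q --force origin master

# The reflog still records the previous value of origin/master.
git reflog show origin/master

# Give the old tip a branch name so it is reachable again.
# "rescue" is a hypothetical name chosen for this example.
git branch --no-track rescue 'origin/master@{1}'
git log --oneline rescue
```

Once a branch name points at F again, the commits are protected from expiry for as long as that branch exists.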

How and why this works

Every Git repository has its own branch names, which are semi-private to that particular Git repository. There is a repository over on the machine you're calling origin, and it has a branch name master.

Your remote-tracking names,¹ such as origin/master, are your Git's method of remembering what some other Git has as its branch names. In this case your origin/master is your Git's memory of origin's master.

When you have your Git call up their Git, via git fetch or git push, the two Gits have a startup conversation. It's slightly different for fetch and push, but you can observe what git fetch will itself observe using git ls-remote. Just run git ls-remote origin (or leave origin out, as the default is usually origin anyway): your Git will spill out the other Git's branch, tag, and other names that it listed. That's the first part of a git fetch: your Git phones up their Git and gets this listing.
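To see that listing next to your own Git's memory of it, something like the following sketch can be run in a scratch directory. The temporary paths, the "demo" identity, and the local bare repository standing in for origin are assumptions for illustration.

```shell
set -eu
# Throwaway identity so commits work without global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Recreate the fork from the question: origin has F, local master has D.
for m in A B E F; do git commit -q --allow-empty -m "$m"; done
git push -q origin master
git reset -q --hard HEAD~2
for m in C D; do git commit -q --allow-empty -m "$m"; done

# Everything the other Git advertises: one "<hash> <name>" pair per line.
git ls-remote origin

# Compare their master with our memory of it and with our own master.
theirs=$(git ls-remote origin refs/heads/master | cut -f1)
remembered=$(git rev-parse origin/master)
ours=$(git rev-parse master)
echo "their master:     $theirs"
echo "origin/master:    $remembered"
echo "our local master: $ours"
```

Here the first two hashes agree (both are commit F) while the third differs (commit D), which is exactly the state drawn in the question.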

When you use git fetch, your Git uses this listing to ask their Git for any commits and other internal Git objects that they have, that you don't, that your Git would like to have. For instance, if you didn't have commit F yet, and your origin/master remembered commit B or E instead, and they had commit F as their master, your Git would say "I'd like to have commit F, please" and they'd add that to a list of commits to send. They'd then offer commit E (a Git must offer the parents of each commit) and your Git would either say "no thanks, I have that already" or "yes please" (in which case they would offer B, E's parent, and so on).

Their fetch process then bundles up whatever is needed to get those commits to your Git (this is where you'll see "counting objects" and "compressing objects") and sends that over. Your Git expands those so that you have a proper set of commits and their data, and then creates or updates your origin/master based on the hash ID stored in their master: now that you definitely have commit F, your Git can make your origin/master point to it.

Hence, if you run:

git fetch origin

at any time—this is the "get me everything" form of the command—your Git will call up their Git and get the preliminary "everything" listing. The two Gits will figure out from there what your Git needs to get, and give it to your Git. Your Git will then create or update all your remote-tracking names based on all of their branch names.
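The effect on remote-tracking names can be sketched as follows. A second clone, here called other, stands in for whatever put commits E and F on origin; that clone, the temporary paths, and the "demo" identity are all assumptions made for the example.

```shell
set -eu
# Throwaway identity so commits work without global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Local and origin both start at A--B.
for m in A B; do git commit -q --allow-empty -m "$m"; done
git push -q origin master

# A second clone adds E and F to origin behind our back.
git clone -q "$work/origin.git" "$work/other"
( cd "$work/other" &&
  git commit -q --allow-empty -m E &&
  git commit -q --allow-empty -m F &&
  git push -q origin master )

before=$(git rev-parse origin/master)   # our memory is stale: still B
git fetch -q origin                     # get the listing, then E and F
after=$(git rev-parse origin/master)    # memory refreshed: now F

echo "origin/master before fetch: $before"
echo "origin/master after fetch:  $after"
git log --oneline -1 master             # local master itself is untouched
```

Note that git fetch moves only origin/master; the local branch master stays where it was until you integrate the new commits yourself.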

When you use git push, which is as close as Git gets to the opposite of fetch (the pair are push and fetch rather than push and pull due to a historical mistake²), a similar process occurs, but there are two key differences:

1. Your Git sends, rather than receives. (Your Git still chooses what to send, and the "must offer parent if sending commit" rule still holds, and so on, but with git fetch, their Git sent and your Git received.)
2. At the end, instead of some Git updating some sort of "remote-tracking name", your Git asks (politely) or commands (forcefully) their Git to set some of its branch names.³

If you ask or command them to set their master, and they obey, your Git now knows that their master is set to the hash ID you provided, so your Git now updates your origin/master remote-tracking name.

When you use a gentle, ask-type git push origin master, your Git sends commits you have that they don't, such as C and D, and then asks them politely: "Please, if it's OK, set your master to point to commit D?" They will say "No! If I do that, I'll forget commit F!" This error comes back as the Git jargon words "rejected" and "non-fast-forward", but it just means they said no, as that would lose commit F and hence E as well.
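The refusal can be observed with a sketch like this one. The temporary paths, the "demo" identity, the local bare repository standing in for origin, and the push.err scratch file are assumptions for illustration; LC_ALL=C is set only so the message appears in English.

```shell
set -eu
export LC_ALL=C
# Throwaway identity so commits work without global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Recreate the fork from the question: origin has F, local master has D.
for m in A B E F; do git commit -q --allow-empty -m "$m"; done
git push -q origin master
git reset -q --hard HEAD~2
for m in C D; do git commit -q --allow-empty -m "$m"; done

# The polite request: "please set your master to D, if that's OK".
# The push is expected to fail, so test it with "if" instead of letting
# "set -e" abort the script.
if git push origin master 2>"$work/push.err"; then
  status=accepted
else
  status=rejected
fi

cat "$work/push.err"
echo "polite push was: $status"
# Their master still points to F.
git --git-dir="$work/origin.git" log --oneline -1 master
```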

But that's precisely what you want them to do. So you just need to send them a forceful command: "Set your master!" They could still refuse (if you don't have permission, for instance), but of course you do have permission, so in this case they will obey, and lose commit F.

There are two force modes for git push: git push --force is the ancient version, and git push --force-with-lease is a more modern (since Git 1.8.5) variant. The difference between these is that --force just says "set", but --force-with-lease says: "I think your master is _____. If so, set it to _____ and let me know that you did; if not, tell me I was wrong and don't set your master after all." The first blank gets filled in with your Git's origin/master value: your Git's memory of their Git's master. So that gives you the ability to be sure that you're having them throw away exactly what you think they will throw away.

If you're the only user of the other Git, there's no need for the care that --force-with-lease adds: what you think is right, is right, by definition. But if you're concerned that you might be forgetful (e.g., if you use two different laptops and can't remember which one was up to date), you can use the fancier check-first-then-force --force-with-lease. It's never wrong to use --force-with-lease, and if it weren't for backward compatibility with the older --force, --force-with-lease really should be the default behavior.
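The check that --force-with-lease performs can be sketched like this. A second clone, here called other, plays the part of the forgotten second laptop and pushes a commit G; that clone, the commit G, the temporary paths, and the "demo" identity are all assumptions made for the example.

```shell
set -eu
# Throwaway identity so commits work without global Git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"

# Recreate the fork from the question: origin has F, local master has D.
for m in A B E F; do git commit -q --allow-empty -m "$m"; done
git push -q origin master
git reset -q --hard HEAD~2
for m in C D; do git commit -q --allow-empty -m "$m"; done

# The "other laptop" pushes G on top of F; our origin/master still says F.
git clone -q "$work/origin.git" "$work/other"
( cd "$work/other" &&
  git commit -q --allow-empty -m G &&
  git push -q origin master )

# "I think your master is F; if so, set it to D." It is G, so they refuse.
if git push -q --force-with-lease origin master 2>/dev/null; then
  first=accepted
else
  first=rejected
fi
echo "first attempt:  $first"

# Refresh our memory, inspect G, and decide it is disposable too.
git fetch -q origin
git log --oneline origin/master
git push -q --force-with-lease origin master
echo "second attempt: accepted"
```

A plain --force in place of the first attempt would have discarded G without any warning. The lease only protects you if you actually look at what git fetch brought in before retrying.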

¹ Git calls these remote-tracking branch names, but I decided some years ago that the word "branch" in here is useless clutter that creates some extra confusion, so I now just call them remote-tracking names.

² Often, after fetching commits, you'd like to integrate those commits. That's what git pull does: it runs git fetch to get commits, then runs a second Git command to integrate those commits. This seemed, early on, like the correct natural granularity, so git pull was the user-facing command that was the opposite of git push .

It turns out, though, that it's often important to separate the fetch operation from the integrate operation, inserting some sort of inspect operation in between, for instance, so as to choose the correct form of integration. Mercurial got this right and Git got it wrong: in Mercurial, hg pull means what Git means by git fetch and you have to add a separate operation, or an extra flag, to hg pull to make it also do an integration step.

³ You can choose what names you ask them to set, so you can make your Git ask or command tag names to be set, or some other kind of name, but in general, the usual case here is "branch name".
