
How do I revert local files to how they were before git pull?

I was trying to pull a remote repo into my local copy and forgot to stash my local files before the action. How can I revert this? My local changes are gone, and I had made a lot of changes.

This isn't really an answer. Unfortunately, there isn't an answer. In general, Git should not have lost your files, unless you explicitly told it to. There are a number of ways to tell Git "lose my files for me," though. Without seeing the exact set of commands you ran, nobody can give you anything other than general advice.


In general, a pull or merge that would lose unsaved work should fail:

$ git merge
Updating 51ebf55b93..98cedd0233
error: Your local changes to the following files would be overwritten by merge:
    Makefile
Please commit your changes or stash them before you merge.
Aborting

with git pull producing the same message on my machine running Git 2.24.0.

Some earlier Git versions will give a different error, and some truly ancient Git versions really did sometimes wreck your work-tree if you had uncommitted work. You don't mention which version of Git you have, but if it's at least Git 2.0 this should not be a problem.
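If you want to watch this refusal happen without risking real work, here is a minimal sketch that runs entirely in a throwaway directory. The repository names upstream and mine, and the file contents, are made up for illustration; the sketch assumes a reasonably modern Git (one that understands git -C).

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo "v1" > upstream/Makefile
git -C upstream add Makefile
git -C upstream commit -q -m "first"
git clone -q upstream mine

# Upstream changes Makefile; meanwhile we edit it locally without committing.
echo "v2" > upstream/Makefile
git -C upstream commit -q -am "second"
echo "my local edit" > mine/Makefile

# The merge half of the pull should refuse to run and leave our file alone.
if git -C mine pull -q origin master 2>pull.err; then
    pull_result=succeeded
else
    pull_result=refused
fi
local_content=$(cat mine/Makefile)
echo "pull $pull_result; Makefile still contains: $local_content"
```

The error text that Git printed ends up in pull.err; on a recent Git it should resemble the message shown above.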

Nonetheless, I recommend avoiding git pull . What git pull does is run two Git commands. The first one is simply git fetch . For instance, running git pull origin runs git fetch origin ; running git pull origin master runs git fetch origin master ; and running git pull with no options runs git fetch with no options.

It's the second command that's dangerous—or, well, dangerous is too strong a word: the second command can mess with your own work. It's always safe to run git fetch . 1 Learn how to read what git fetch prints—it's kind of confusing, but it all means things. See below for details.

The second command that git pull runs defaults to git merge. Learn what git merge does, and how it affects your current work-tree. Learn its options: --ff-only, --no-ff, and so on.

You can choose to have git pull run git rebase as its second command. Learn what git rebase does, and how it affects not only your current work-tree, but also the commits that you rebase in the process. Learn its options, such as -i for interactive rebase. Learn that git rebase is in effect a series of repeated git cherry-pick commands, 2 run in Git's detached HEAD mode. 3
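To make the "repeated cherry-pick in detached HEAD mode" idea concrete, here is a hand-rolled sketch of what a simple rebase amounts to. It is not how git rebase is implemented internally, just an illustration under friendly assumptions: no conflicts, a linear history, and made-up repository and file names.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo base > upstream/base.txt
git -C upstream add base.txt
git -C upstream commit -q -m "base"
git clone -q upstream mine
git -C mine config user.name Demo
git -C mine config user.email demo@example.com

# They add one commit; we add two of our own on top of the old base.
echo theirs > upstream/theirs.txt
git -C upstream add theirs.txt
git -C upstream commit -q -m "their work"
cd mine
for n in 1 2; do
    echo "mine $n" > "mine$n.txt"
    git add "mine$n.txt"
    git commit -q -m "my work $n"
done
git fetch -q origin
old_tip=$(git rev-parse master)

# Hand-rolled "rebase": detach at their tip, then copy our commits, oldest first.
git checkout -q --detach origin/master
for c in $(git rev-list --reverse origin/master..master); do
    git cherry-pick "$c" >/dev/null
done
git branch -f master HEAD              # point the branch name at the copies
git checkout -q master
new_tip=$(git rev-parse master)
echo "master moved from $old_tip to $new_tip"
```

The two copied commits get new hash IDs because their parents changed; the originals are still in the repository, just no longer reachable from master.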

Then, after running git fetch and seeing what git fetch did, decide whether you want to look at the commits that came in, and whether you want a merge—with or without --ff-only or --no-ff —or whether you want a rebase. You now have the space and time to run git log if you want, and space and time to decide on rebase vs merge.
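As a rough sketch of that fetch, look, then decide workflow, the following runs in a scratch directory. The names upstream and mine are placeholders, and the fast-forward-only merge at the end is just one of the choices you might make after looking.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream mine
echo two >> upstream/file.txt
git -C upstream commit -q -am "second"

cd mine
git fetch -q origin                                    # step 1: only adds commits
incoming=$(git rev-list --count HEAD..origin/master)   # commits they have that we lack
git log --oneline HEAD..origin/master                  # look before integrating
git merge -q --ff-only origin/master                   # step 2, chosen deliberately
remaining=$(git rev-list --count HEAD..origin/master)
echo "incoming: $incoming, remaining after merge: $remaining"
```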

Once you're very familiar with all these elements, and are pretty sure what's happened in whatever repository you will git fetch from, then it becomes safe to pick merge vs rebase in advance and use the convenience git pull short-cut to run git fetch and then immediately run the second command too.
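One way to "pick in advance" is through configuration. The sketch below sets pull.ff to only in a single throwaway clone, so that git pull there will fast-forward or fail; setting pull.rebase to true instead would select rebase. Which setting suits you is your call, and the repository names are made up.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream mine
echo two >> upstream/file.txt
git -C upstream commit -q -am "second"

cd mine
git config pull.ff only                # this clone only: pull may only fast-forward
# Alternative choice: git config pull.rebase true
git pull -q                            # fetch, then merge --ff-only, in one step
setting=$(git config --get pull.ff)
mine_tip=$(git rev-parse HEAD)
their_tip=$(git -C ../upstream rev-parse master)
echo "pull.ff=$setting; in sync: $mine_tip"
```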


1 If you fiddle with internal git fetch settings, you can make this unsafe. Think of it as rewiring your light switches so that the wall near the switch has an electrified plate. If you don't go breaking your setup, git fetch will be safe. If you do, that's on you. 😀

2 One kind of rebase really does use git cherry-pick . Another uses git format-patch and git apply instead. This second kind of rebase is in most ways not as good, but does go faster, so it used to be the default—but Git 2.26 is about to switch the default to use the cherry-pick method.

3 A rebase can stop in the middle and need help from you to fix it. When that happens, Git continues to be in this detached HEAD mode. Hence, you should learn what detached HEAD mode is about.


What git fetch does

Remember that Git is really all about commits . Each commit holds a full and complete snapshot of all of your files. And, each commit has some metadata: some information about the commit, that's not part of the committed data but that Git wants or needs to retain for you as well. Each commit gets a unique hash ID: a big ugly string of letters and digits, 4 rather than a nice simple number just counting up (1, 2, 3, ...).

The real magic of a commit hash ID—and the reason it has to be so big and ugly—is that it's unique across every Git repository everywhere. Remember that not only do you have your Git repository, or maybe multiple ones, everyone else has their repositories too. There's another Git over at origin . That's the repository you cloned, when you ran:

git clone <url>

Your Git made a new, empty repository, stuck the URL you gave into it under the name origin , and then called up the other Git over at that URL. That other Git then offered, to your Git, its commits, telling your Git about its branch names (and tag names, and other names, but we'll concentrate on branch names here). Your Git had their Git give you all of their commits, and your Git put all of those commits into your database-of-all-commits. Then your Git took their branch names and copied them to make your remote-tracking names.

Their master became your origin/master . Their develop , if they had one, became your origin/develop . The pattern here is straightforward: for each branch they have, your Git creates an origin/ -prefixed variant. In this name, your Git stashes the hash ID that their Git says goes with their branch names.

In short, your remote-tracking names remember what their branch names were.
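You can check this naming pattern in a scratch clone. In the sketch below, upstream, mine, and the develop branch are invented for the demonstration.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git -C upstream branch develop         # a second branch on "their" side
git clone -q upstream mine

git -C mine branch -r                  # lists the origin/* remote-tracking names
theirs=$(git -C upstream rev-parse develop)
ours=$(git -C mine rev-parse origin/develop)
# Cloning copies their branch names to origin/*, but creates only one local branch.
if git -C mine rev-parse -q --verify refs/heads/develop >/dev/null; then
    has_local_develop=yes
else
    has_local_develop=no
fi
echo "their develop: $theirs, our origin/develop: $ours, local develop: $has_local_develop"
```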

But you cloned that Git repository seconds ago. Or maybe even minutes, or hours, or (horror) days or weeks. Much has happened since then! (An active repository can gain dozens of commits every few seconds, in some cases.)

So, to pick up new stuff from them, you run:

git fetch origin

(or just git fetch , which generally defaults to using origin anyway). Your Git calls up their Git, like it did earlier. It has them list out their branch names (and tag and other names) like before. Their master may be different now! If so, they may have commits that your Git doesn't. Your Git now gets, from their Git, the new commits they have, that you don't. 5 Your Git adds these to your collection, and now you have them—and now your Git updates your remote-tracking names .
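A small sketch of that update, again in a scratch directory with made-up names: after the fetch, the remote-tracking name has moved to their new commit, while my own branch has not moved at all.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream mine
echo two >> upstream/file.txt
git -C upstream commit -q -am "second" # their master moves on after we cloned

cd mine
local_before=$(git rev-parse master)
tracking_before=$(git rev-parse origin/master)
git fetch -q origin
local_after=$(git rev-parse master)
tracking_after=$(git rev-parse origin/master)
their_master=$(git -C ../upstream rev-parse master)
echo "origin/master: $tracking_before -> $tracking_after"
echo "master:        $local_before -> $local_after"
```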

I have a clone (well, one of several) of the official Linux kernel, which I tend to update very sporadically, on a handy test machine here. Let's run git fetch on that one now and observe:

$ git fetch
remote: Enumerating objects: 243814, done.
remote: Counting objects: 100% (243814/243814), done.
remote: Compressing objects: 100% (45189/45189), done.
remote: Total 243814 (delta 198891), reused 242574 (delta 198039)
Receiving objects: 100% (243814/243814), 76.86 MiB | 4.94 MiB/s, done.
Resolving deltas: 100% (198891/198891), completed with 12626 local objects.
From git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux
   0f1a7b3fac05..5ad0ec0b8652  master     -> origin/master
 * [new tag]                   [snip lots of new tags]

We see a bunch of messages here. The first few start with remote: . These are actually messages generated by their Git, that our Git just copies through so we can see them. They talk about enumerating, counting, and compressing objects . These objects are the internal representation that Git uses for each of its database objects: commits, trees, blobs, and annotated tags. Every commit uses some number of trees and blobs to hold its files, in a frozen-for-all-time snapshot. Commits can share trees and blobs to save space: if a commit re-uses a file, or a whole tree full of files, from an earlier or later commit, they just share their trees and/or blobs.
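To poke at these four object types yourself, git cat-file -t reports the type of any object. The sketch below builds a tiny throwaway repository (the file names and the tag name v1 are made up) and also shows two commits sharing one blob for a file that did not change.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q repo && cd repo
git config user.name Demo
git config user.email demo@example.com
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo other > other.txt
git add other.txt
git commit -q -m "second"              # file.txt is unchanged in this commit
git tag -a v1 -m "annotated tag"

commit_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')
blob_type=$(git cat-file -t HEAD:file.txt)
tag_type=$(git cat-file -t v1)
echo "$commit_type $tree_type $blob_type $tag_type"

# Both commits refer to the same blob for the unchanged file.
blob_first=$(git rev-parse HEAD~1:file.txt)
blob_second=$(git rev-parse HEAD:file.txt)
echo "file.txt blob in first commit:  $blob_first"
echo "file.txt blob in second commit: $blob_second"
```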

The counting step counts how many of these objects they have, that I don't, that they are going to send. In general, they won't send ones that I do have. 6 So this counting happened for many commits and their files, and some annotated tags (the new tag stuff above), and it counted out nearly a quarter-million objects.

Then, they compressed some of those objects (in this case about 20% of them, which will generally account for a much larger share of the total bulk of the network transfer) and sent them to my Git. We now leave the remote:-prefixed lines and reach the point where my Git reconstructed the original objects and added them to my repository.

Last, my Git updated my remote-tracking name, origin/master , based on their master . This is the really important line:

   0f1a7b3fac05..5ad0ec0b8652  master     -> origin/master

The two hash IDs here are:

  • 0f1a7b3fac05 : my origin/master held this hash ID before my git fetch updated it;
  • 5ad0ec0b8652 : my origin/master held this hash ID after my git fetch updated it.

After this, we see their branch name, master , then this ASCII arrow -> , then my remote-tracking name, origin/master .

This tells me that my origin/master has been adjusted to add new commits, without discarding any existing commits. If I want to know how many commits I just got, I can now count them:

$ git rev-list --count 0f1a7b3fac05..5ad0ec0b8652
31244

So, since the last time I ran git fetch , I've just picked up another 31,244 commits, all on their master / my origin/master .

(I also picked up a bunch of new tags, for kernel versions 5.4 and 5.5 and the upcoming 5.6.)

The short version of all this is that you can pick up the git fetch output here and cut-and-paste the two hash IDs into a git log or git rev-list --count to look at the commits, or see how many there were.
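Here is the same idea at toy scale. Rather than cutting and pasting from the fetch output, this sketch records the two hash IDs with git rev-parse before and after the fetch, which should be the same pair that git fetch prints; the repository names and the three extra commits are made up.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo 0 > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream mine
for n in 1 2 3; do                     # three new commits appear upstream
    echo "$n" >> upstream/file.txt
    git -C upstream commit -q -am "update $n"
done

cd mine
old=$(git rev-parse origin/master)     # left-hand hash ID in the fetch output
git fetch -q origin
new=$(git rev-parse origin/master)     # right-hand hash ID in the fetch output
count=$(git rev-list --count "$old..$new")
git log --oneline "$old..$new"         # look at exactly the commits that came in
echo "picked up $count commits"
```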

This particular repository doesn't have branches that get git push --force applied to them. The Git repository for Git itself, however, does. My copy of that one here is much more up to date so I can't show actual git fetch output, but it would look like this:

 + old...new  pu     -> origin/pu (forced update)

This output has three things that are different, besides the two hash IDs being different (and fake in this case):

  1. it has a + at the front, indicating that the update was forced;
  2. it has three dots between the two hash IDs, indicating that the update was forced; and
  3. it ends with (forced update) , indicating that the update was forced.

All three indicators mean the same thing. What matters is that my Git lets me know that this was a forced update, i.e., I now have some commit(s) left over in my repository that they decided should be thrown out. If I'm using their pu branch (my origin/pu) in some way, I should be aware of this.

The pu (Proposed Update or PickUp) branch in Git is one that all Git developers, and other hangers-on, have agreed is regularly rebased, or rewound and rebuilt. Commits in this branch may be replaced with new and improved ones, or discarded entirely; we all agree that we will deal with this, not complaining about it.
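If you have the old and new hash IDs in hand, you can also ask Git directly whether an update was a plain fast-forward or a rewind. This sketch fakes a rewound pu branch in a scratch repository (the names and commits are invented) and uses git merge-base --is-ancestor to tell the two cases apart.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/pu
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo a > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "A"
echo b > upstream/file.txt
git -C upstream commit -q -am "B"
git clone -q upstream mine

# Upstream rewinds pu by one commit and builds a replacement.
git -C upstream reset -q --hard HEAD~1
echo b2 > upstream/file.txt
git -C upstream commit -q -am "B, improved"

cd mine
old=$(git rev-parse origin/pu)
git fetch -q origin
new=$(git rev-parse origin/pu)
if git merge-base --is-ancestor "$old" "$new"; then
    kind="fast-forward"
else
    kind="forced update"
fi
dropped=$(git rev-list --count "$new..$old")   # commits they threw away
echo "$kind; commits no longer on their pu: $dropped"
```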


4 Technically, the hash ID is just a bit or byte string. Git currently uses SHA-1 hashes, which are 160 bits long, and then represents these hash values as 40-character hexadecimal numbers. Git is gradually moving towards replacing these with SHA-256 hashes (from the SHA-2 family), which are 256 bits long.
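A quick way to see one of these hash IDs is git hash-object, which computes the ID Git would give to some content. The sketch assumes the default SHA-1 object format, and the sample strings are arbitrary.

```shell
set -eu
id_a=$(printf 'hello\n' | git hash-object --stdin)
id_again=$(printf 'hello\n' | git hash-object --stdin)
id_b=$(printf 'hello!\n' | git hash-object --stdin)
echo "$id_a"
echo "$id_b"
# Each hex digit carries 4 bits, so 40 digits make 160 bits.
echo "${#id_a} hex characters = $(( ${#id_a} * 4 )) bits"
```

The same content always gets the same ID, and even a one-character change produces a completely different one.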

5 Note that you may have added your own commits to your repository before this point. That's why Git can't use simple sequential numbers. Suppose you got 1000 commits from them, numbered 1 through 1000. Then you make two commits of your own, numbers 1001 and 1002. Now you call up their Git and they have three new commits, numbered 1001 through 1003. What do we do about these overlapping numbers?

It is actually possible to keep some local numbers. Mercurial, which is as capable as Git, does so. The local numbers are sometimes handy—but they're not as usable as one might like, in the end. Git just doesn't bother with them.

6 This generality makes a lot of assumptions. Git trades off being totally precise vs doing this with minimal information and doing it quickly. If your Git and their Git had a longer conversation, which would take more network bandwidth, their Git could sometimes omit some objects, which would take less network bandwidth.


Why you need a second Git command

When git fetch obtains new commits, they just sit there in your repository database. You can't see any of these directly. You can run git log on them, or git show . You can use the commit hash IDs that you find with your remote-tracking names. But in general, you'll have to do something with those commits to get the files.
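For example, the following sketch looks at a fetched commit through the remote-tracking name without touching the work-tree. The names upstream, mine, and file.txt are placeholders.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream mine
echo two > upstream/file.txt
git -C upstream commit -q -am "second"

cd mine
git fetch -q origin
git log --oneline origin/master                # history as of their latest commit
their_version=$(git show origin/master:file.txt)
my_version=$(cat file.txt)                     # the work-tree still has the old file
echo "theirs: $their_version, mine: $my_version"
```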

If you make your own commits, the something that you need to do now is usually some form of "incorporate their work into my work". The two main Git commands to do this are git merge, which merges their work into yours, and git rebase, which copies your original commits to new-and-improved commits. The improvement (at least, you hope it's an improvement) is that the copies come after their new commits, instead of alongside theirs.

Both of these commands affect your branches.

Fetch does not: git fetch gets you new commits but your existing commits are totally unaffected by adding new ones. The new ones have different big ugly hash IDs. Every new commit has a different hash ID than every other commit. Commits don't have small numbers; nobody will make commit #3, as commit #3 doesn't ever exist. All git fetch does is add new commits and update your remote-tracking names.

But both git merge and git rebase need to use your work-tree. Your work-tree or working tree (Git mostly uses the longer term now) is where Git extracts the commit snapshots, so that you can see them and work on / with them. Hence the name work-tree: it's where you do your work. The merge operation often needs to change your work-tree around. The rebase operation runs repeated git cherry-pick operations, each of which often needs to change your work-tree around. Both git merge and git cherry-pick can have merge conflicts, which you have to resolve: these mean Git couldn't figure it out on its own. In this case, Git leaves you a mess in your work-tree, that you must fix.

In both cases, the final result of a merge or rebase can change which commit is the last commit in your branch. They always work with your current branch. So these two commands need a lot of careful use, and it's always wise to run git status first, to make sure your current branch is (a) the right branch and (b) ready for a merge or rebase operation.
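If you would rather script that check than eyeball git status, something like the sketch below might do. The helper name ready_for_merge is made up, and "ready" here only means "on the expected branch with nothing uncommitted", which is stricter than Git itself requires.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git config user.name Demo
git config user.email demo@example.com
echo one > file.txt
git add file.txt
git commit -q -m "first"

# Hypothetical helper: $1 is the branch we expect to be on.
ready_for_merge() {
    branch=$(git symbolic-ref --short -q HEAD || echo "(detached)")
    if [ "$branch" = "$1" ] && [ -z "$(git status --porcelain)" ]; then
        echo yes
    else
        echo no
    fi
}

echo edit >> file.txt
while_dirty=$(ready_for_merge master)      # uncommitted change present
git commit -q -am "second"
while_clean=$(ready_for_merge master)      # right branch, nothing pending
git checkout -q --detach
while_detached=$(ready_for_merge master)   # not on any branch
echo "dirty: $while_dirty, clean: $while_clean, detached: $while_detached"
```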

Feel free to use git pull; just know that it will (1) run git fetch, then (2) run one of these other commands. You won't get a chance to look at, and make decisions based on, the git fetch output: you've committed in advance to an immediate git merge or git rebase. 7 There is nothing fundamentally wrong with this, but I prefer to keep these two operations separate.

(I have an alias, git mff , that runs git merge --ff-only , which is what I often use after git fetch . That, combined with a brief glance at the fetch output and the current status, is usually what I want, and if it doesn't work, then I want to look more closely at everything.)
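An alias like that can be defined with git config. The sketch below sets it in one scratch clone only (in everyday use you would more likely add --global), and shows the "doesn't work" case: when my branch and theirs have diverged, the fast-forward-only merge refuses and leaves everything where it was. All repository and file names are invented.

```shell
set -eu
work=$(mktemp -d) && cd "$work"        # scratch directory; all names are hypothetical
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.name Demo
git -C upstream config user.email demo@example.com
echo one > upstream/file.txt
git -C upstream add file.txt
git -C upstream commit -q -m "first"
git clone -q upstream mine
git -C mine config user.name Demo
git -C mine config user.email demo@example.com

# Histories diverge: one new commit on each side.
echo two >> upstream/file.txt
git -C upstream commit -q -am "their work"
cd mine
echo local > local.txt
git add local.txt
git commit -q -m "my work"

git config alias.mff "merge --ff-only" # this clone only
git fetch -q origin
before=$(git rev-parse HEAD)
if git mff origin/master >/dev/null 2>&1; then
    outcome=fast-forwarded
else
    outcome=refused
fi
after=$(git rev-parse HEAD)
echo "git mff $outcome; HEAD unchanged: $before = $after"
```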


7 There are some unusual circumstances where the second part of git pull is neither merge nor rebase. For instance, git pull into a totally empty repository requires doing a git checkout instead, so it does that. In ancient Git (1.5.something or 1.6.something), this lost me a day or two of work, once, because it was not coded carefully and I hadn't committed yet. That's one of the reasons I avoid git pull : it's burned me more than once, with that particular case being the worst one.
