
How to push a local repository to a branch of another repository on Github

Is it possible to add a project that is only stored on my local machine to a branch of an existing project on GitHub?

For example add:

local-project

to the Github repo:

https://github.com/myusername/my-project/local-branch.git

Can this be done or should the local repo first be pushed to its own GitHub repo?

Additional info: Here is some extra info as per the comments I have received so far. So basically I have a Github repo (my-project) that contains a Next.js project that is currently in production. I would like to update this project to a completely different language (REACT NATIVE), that is currently still in development. The React Native version I have been working on is only locally on my computer, as I cloned a repo and haven't pushed it to Github yet. My ultimate goal is to have the React Native version as a branch (local-branch) in the (my-project) Github repo. Then when the local-branch is ready for production I can simply merge it to main.

First of all: you can push a branch, not a repository.

Now:

git remote add remote-repo <remote-repo-url>
git push remote-repo local-branch:<remote-new-branch-name>

First to clarify some terminology, you can't "push a local repository". Instead, you can push a branch from your local repo to a remote repo. First you need to create a remote:

git remote add <name> <url>

For example (note that the URL names the repository itself, not a branch inside it):

git remote add my-remote https://github.com/myusername/my-project.git

By default, git will create a remote named origin when you git clone another repo.

Once you have a remote, you can push your branch:

git push my-remote my-branch

While you can do this with any branch and any repo, it usually only makes sense if the two repositories have a common history, such as when you clone a GitHub repo locally or if you create a new repo locally and want to push a branch to a new, empty repo on GitHub.
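To make this concrete for the situation in the question, here is a minimal sketch that runs entirely on one machine. It assumes a bare repository in a temporary directory as a stand-in for the GitHub repo; the names seed, my-project.git and the demo identity are hypothetical, and with a real GitHub remote you would use its URL instead of a local path.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the existing GitHub repository: a bare repo with one commit on main.
git init -q --bare "$work/my-project.git"
git -C "$work/my-project.git" symbolic-ref HEAD refs/heads/main
git init -q "$work/seed"
git -C "$work/seed" symbolic-ref HEAD refs/heads/main
echo "production site" > "$work/seed/README"
git -C "$work/seed" add README
git -C "$work/seed" commit -q -m "production site"
git -C "$work/seed" push -q "$work/my-project.git" main

# The local-only project, with its own, unrelated history.
git init -q "$work/local-project"
cd "$work/local-project"
git symbolic-ref HEAD refs/heads/main
echo "mobile app" > App.js
git add App.js
git commit -q -m "mobile rewrite"

# Add the existing repository (not a branch of it) as a remote,
# then push local main to a new remote branch named local-branch.
git remote add my-remote "$work/my-project.git"
git push -q my-remote main:local-branch

# The remote now lists both branches; they share no commits.
git -C "$work/my-project.git" branch --list
```

The push itself works even though the histories are unrelated. The catch comes later: because main and local-branch have no common ancestor, a plain git merge of one into the other is refused unless you pass --allow-unrelated-histories.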

You need to start with clear definitions. In particular:

  • A Git repository is, at its heart, a collection of commits—a database of what Git calls commit objects , plus supporting objects that make the commits actually useful (make them contain files)—along with a secondary database to help you and Git find the commits.

  • This in turn means that Git is all about commits . It's not about files! Commits contain files, which is why we use commits. But Git isn't all that concerned with the files, at least when taking a high level view. And, it's not really about branches either, but we organize and—most importantly— find the commits via branch names, which is how branches enter the picture.

Git doesn't really have a concept of a project . That concept is up to you to define. Git has the concept of a repository , and the repository is made up of commits and other objects. So if you have an existing repository, you have an existing pile of commits (along with their other objects).

In your update, you say:

I would like to update this project to a completely different language (REACT NATIVE), that is currently still in development.

You're still talking about a "project". Git doesn't have those, so you must map your own concept of "project" onto the things Git has: repositories that contain commits.

The React Native version I have been working on is only locally on my computer, as I cloned a repo[sitory]...

Here, you mention another (apparently different) repository. This is something Git understands. You now seem to have at least three repositories:

  • one on GitHub;
  • a different one you cloned from somewhere (somewhere else on GitHub?); and
  • a third one that's closely related to the second one.

and haven't pushed it to Github yet.

The pronoun "it" here seems to refer to your third repository. But you do not (and cannot) push a repository. You push some set of commits (and their supporting objects), which exist within a repository, and which you and Git find using branch names and other names.

It's now time to talk a bit more about repositories, commits, and clones.

The repository as a pair of databases

Databases take many forms, but the two that make up the bulk of any Git repository are simple key-value stores. A key-value database takes, as its input, a key, and uses that to retrieve a value.

For a commit object, the key is the commit's name. The "name" of a commit is a big, ugly, random-looking (but not actually random) hash ID such as 4af7188bc97f70277d0f10d56d5373022b1fa385 . These hash IDs are particularly magical, because:

  • they are unique: no two different commits ever have the same hash ID;
  • every Git system everywhere in the universe computes them the same way, so that even if you haven't made a commit yet, once you make it and it gets some hash ID, every Git system in the universe will agree that, yep, that commit you just made, just now, deserves and gets that hash ID and no other commit can ever have it now that you've taken it.

This deep magic is related in various ways to cryptography, and the way Git [ab]uses it literally can't work forever. The sheer size of the hash ID space is meant to make sure it works long enough that nobody cares. But all you really need to know is that the hash ID—which Git formally calls an object ID or OID—is the "true name" of the commit. Git literally needs this hash ID to find the commit in its big database of all of its objects.

If this were the only database Git had, we'd all have to memorize these random-looking hash IDs. This would be very bad, because humans are bad at these things. So Git has a second database. This second database is also a simple key-value store, but the keys in it are things like branch and tag names, which are human-readable and make sense to humans: names like main or master , develop , and so on. Git will store the hash ID of the latest commit in some branch under the branch name, in this second database.

That means all you have to remember is the branch name you're using. You say "get me the latest main commit" or "get me the latest dev commit", and Git fishes out the hash ID from the names database and then uses that to fish the commit out of the big all-objects database.

In other words, you don't need to memorize hash IDs. You do still need to know that the commits are numbered , using these funny-weird hexadecimal OID or hash ID thingies, and that there's one per commit. But you don't have to memorize any of them: just run git log , for instance, and Git itself will show them to you, and you can use your mouse to cut-and-paste a hash ID if you need one (eventually, you will need one, now and then—maybe very rarely, maybe once a week or twice a day or something, but someday, probably).
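The two lookups described above can be watched directly. This is a small sketch, assuming a scratch repository in a temporary directory with a demo identity (both hypothetical); it asks the names database for the hash ID behind main, then asks the objects database for the object behind that hash ID.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
echo hello > file.txt
git add file.txt
git commit -q -m "first commit"

# Names database: branch name in, hash ID out.
tip=$(git rev-parse main)
echo "main -> $tip"

# Objects database: hash ID in, object out.
git cat-file -t "$tip"   # object type: commit
git cat-file -p "$tip"   # the commit's metadata

# The whole name-to-hash table for branches.
git for-each-ref refs/heads
```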

What to know about a commit

Besides being numbered like this, a commit:

  • Is completely read-only. This is required for the numbering system to work, but it means that you literally cannot change anything about any commit, once you make it. This isn't really a big deal because commits in Git normally hardly take any space at all.

  • Stores two things:

    • Each commit has a full snapshot of every file .

    • Each commit has some metadata , or information about the commit itself.

The first sub-bullet-point seems to contradict the claim that commits are normally pretty tiny. If every commit holds every file , won't the repository database bloat up enormously? And it would, except for some very clever Git tricks. The number one trick is that the files in the commit are stored in a special, read-only, Git-only format (as objects in that big all-objects database, in fact) in which duplicated file contents are de-duplicated .

Since most commits mostly re-use most of their files from a previous commit, and such files are automatically de-duplicated , they take no space! It's only the changed files that actually take any space. Git later—not right away—compresses those as well; as long as they are normal text or programming-language contents, this usually works tremendously well. (For binary files, though, it usually fails, which is why Git is ill-suited to storing most binary files.)

So this (the compressed and de-duplicated files as Git objects) is how Git stores every file in every commit without actually storing every file in every commit, and hence without having the repository grow obscenely fat. You don't need to know this to use Git; you just need to know that every file seems to be stored, forever, in every commit. That is, make a commit, and you can get all your files back, forever, or rather, for as long as you can find that commit. You'll need its hash ID. (Do you see why hash IDs are important now?)
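The de-duplication is easy to observe. In this sketch (scratch repository, demo identity and file names are all made up for illustration), two commits both contain a.txt, but only b.txt changes between them, so both snapshots name the very same stored object for a.txt.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'unchanged\n' > a.txt
printf 'version 1\n' > b.txt
git add a.txt b.txt
git commit -q -m "first"

printf 'version two, edited\n' > b.txt
git commit -q -am "second"

# Look up the stored object for each file in each commit's snapshot.
old_a=$(git rev-parse HEAD~1:a.txt)
new_a=$(git rev-parse HEAD:a.txt)
old_b=$(git rev-parse HEAD~1:b.txt)
new_b=$(git rev-parse HEAD:b.txt)

echo "a.txt: $old_a -> $new_a (same object, stored once)"
echo "b.txt: $old_b -> $new_b (new object)"
```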

But I just said that you don't need to memorize hash IDs, and that's true. So how does that work? Well, let's look more closely at that metadata. Every commit stores information about itself, and that information includes things like the name and email address of the person who made the commit, and some date-and-time stamps, and so on. But there's a crucial bit of information in here for Git itself: Every commit stores the hash ID of the commit that comes before it.

More precisely, each commit has a list of previous commit hash IDs. But this list is usually exactly one element long. That one element, the one list entry, holds the parent commit's hash ID. This means commits have parent/child relationships, with most commits having just a single parent (mother? father? pick whatever you like here, Git is gender-agnostic).

Because a commit is totally read-only, the child commit can remember its parent's "name" (hash ID), as the parent exists when we create the child. But the parent can't remember its children's names, because its children (if there will be any) don't exist yet. As soon as the child is born, it's cryogenically frozen, and can't learn its future children's names.

We say that the child points to its parent, and if we want, we can draw some commits this way. Using single uppercase letters to stand in for real hash IDs, we'll call the last commit in the branch H (for Hash) and draw it like this:

            <-H

That arrow sticking out of H is how H uses its parent hash ID storage, in its metadata, to point to its parent. We'll draw its parent in now, as the letter G since that comes before H :

        <-G <-H

Of course G points backwards to its parent:

... <-F <-G <-H

and as you can see, this makes an endless backwards-pointing chain except that history eventually runs out when we get back to the very first commit ever. That first commit has no parent, like some sort of virgin birth but without even a mother, so that commit—let's call it A —just doesn't point backwards after all:

A--B--...--G--H

We can get lazy and stop drawing the arrows as arrows because we know they're part of the commit and can't be changed and therefore must point backwards.
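You can see the backwards-pointing chain for yourself. This sketch assumes a scratch repository with three commits whose messages are A, B and C (hypothetical names, demo identity); it prints the parent line stored in the tip commit and then finds the one commit that has no parent.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
for msg in A B C; do
  echo "$msg" > file.txt
  git add file.txt
  git commit -q -m "$msg"
done

# The tip commit's metadata contains a "parent" line holding a hash ID.
git cat-file -p HEAD

# Walk the chain backwards: each commit, then the parent it points to.
git log --format='%h %s (parent: %p)'

# The root commit is the one with zero parents.
root=$(git rev-list --max-parents=0 HEAD)
git log -1 --format='root commit: %h %s' "$root"
```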

Branch names find commits

The problem with the above is that you still have to memorize hash ID H , the last commit in the chain: the tip of the chain. We've already seen, though, that a Git repository contains a second database of names, and that the branch names hold the hash ID of the last commit. As with commits pointing to earlier commits, we say that the branch name points to the tip commit , and we draw that in like so:

...--G--H   <-- branch

To add a new commit to the branch, Git will simply write out the new commit with its full snapshot and metadata, acquiring the new unique hash ID in the process. The hash ID will look random; we'll call it I here, the next letter after H . Commit I will point to commit H , the way any commit has to point backwards to its parent:

...--G--H   <-- branch
         \
          I

and because new commit I is the new tip of the chain of commits, Git will now write I 's hash ID into the branch name in the database, so that the name now points to I :

...--G--H    branch
         \  ↙︎
          I

(the arrow I used here is kind of lame; this is why I get lazy about the arrows between commits). We don't need I on a separate line after all: it makes more sense to draw this as:

...--G--H--I   <-- branch

Note that the arrow sticking out of a branch name can and does move, all the time . This makes it very different from the rigid, backwards-pointing arrow sticking out of a commit, pointing to the commit's parent. Git simply defines the branch name as "this is the last commit". That is, whatever hash ID is in the branch name, that's the last commit on the branch. So to change which commit is the last commit , you have Git stuff a new hash ID into the branch name. This not only lets you add commits to the branch, but also lets you remove commits from the branch:

       H--I   ???
      /
...--G   <-- branch

If we have Git store G 's hash ID in the branch name, we've dropped commits H and I from the branch. They're still in the repository , but now you need to know their hash IDs! If you don't know the hash ID of commit I , you'll never find it again. Git can work backwards —given a name branch pointing to commit I , Git can follow the commit-to-parent arrows for you; that's how git log works—but Git can't work forwards . There are no forward pointing arrows!
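Here is a small sketch of exactly that move, assuming a scratch repository whose only branch is literally named branch and whose commits have the messages G, H and I (all hypothetical, as is the demo identity). It uses git reset --hard as one way to stuff G's hash ID into the branch name, after first saving I's hash ID in a shell variable.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/branch
for msg in G H I; do
  echo "$msg" > file.txt
  git add file.txt
  git commit -q -m "$msg"
done

# Save I's hash ID while the branch name can still find it.
saved=$(git rev-parse HEAD)

# Move the branch name back two commits, to G.
git reset -q --hard HEAD~2

# Walking back from the name now finds only G...
git log --format=%s

# ...but I and H are still in the objects database, reachable by hash ID.
git cat-file -t "$saved"
git log --format=%s "$saved"
```

In practice your own repository's reflog usually remembers such hash IDs for a while, which is a safety net this sketch does not rely on.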

Clones

Besides the above, Git offers the ability to clone a repository. Here's how clone works:

  • You run git clone and give it a URL.
  • Git makes a new, totally empty repository: two databases (for commit and other objects, and for branch and tag and other names), but neither of these has anything in it at all.
  • Your Git software saves this URL in the new repository (in an auxiliary "database" that's actually just a simple file like an INI file ).
  • Your Git software calls up some other Git software, using the URL you provided. They respond by listing out all their branch and tag and other names, and the hash IDs that go with those.
  • Your Git software says "gimme all those objects and all their parents and grandparents and everything that makes up the history", i.e., the full contents of their commits-and-other-objects database.
  • They send all that stuff over, and your Git (your software working with your repo) sticks it into your database.
  • For each of their branch names, your Git software changes those names into your remote-tracking names . Their main or master becomes your origin/main or origin/master ; their develop , if they have one, becomes your origin/develop ; and so on.

You end up with your two databases being full of stuff:

  • You have a full copy of their objects database (well, mostly-full: if they have some objects they can't find, due to dropping commits off the end of branches for instance, they don't need to send you those).
  • You have no branch names. (Whether you have branches depends on what you mean by the word branch.) Instead of branch names, you have remote-tracking names that start with origin/. These are how you will find the commits, for now.

As a final step, your Git now creates one branch name of your own . For instance, if you told git clone to git clone -b develop , your Git will create your own branch name develop . If you didn't use the -b option—and most people don't—your Git asks their Git what name they recommend , and creates that name.

In either case, the tip commit of your new branch is the commit that their branch of the same name selects. That is, if you let your Git create main because they recommend main, the commit your main has as its last commit is the same commit that your origin/main selects, which is the same commit that their main selected at the time you ran git clone.

By now—it's probably been seconds , which on a computer is loads of time—their branch names may select different commits. But at the time you ran git clone , their branch names selected particular commits. Your Git remembers all these, using your repository's remote-tracking origin/* names. Your repository has your own branch names, which you get to create and update however you like, which are different from their branch names, even if you use the same spellings. That's because their databases are not your databases.

What you and they share are the commits . Those are read-only—neither of you can change them—and have hash IDs that every Git in the universe agrees are the right hash IDs, via the hash ID magic. So your clone of the original ( origin ) repository, and that original repository, are very closely related. You have your own branch names , but you share the commits .
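A quick way to check this on your own machine: the sketch below assumes a small source repository with two branches, main and develop, living in a temporary directory (the names src and copy and the demo identity are hypothetical), and clones it. The clone ends up with origin/main and origin/develop as remote-tracking names but only one branch name of its own.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "Their" repository: one commit, two branch names.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main
echo hello > "$work/src/README"
git -C "$work/src" add README
git -C "$work/src" commit -q -m "first commit"
git -C "$work/src" branch develop

# Our clone of it.
git clone -q "$work/src" "$work/copy"
cd "$work/copy"

git config --get remote.origin.url   # the URL saved under the name origin
git branch -r                        # remote-tracking names: origin/...
git branch --list                    # our own branch names: just one
```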

Fetch and push

Given any two repositories—usually, ones that are related , although this isn't actually necessary at the start—we can transfer commits from one repository to another:

  • The way to get commits from them is to use git fetch .
  • The way to give commits to them is to use git push .

In both cases, we must give our Git—our software running with our repository—the URL of the other Git (their software and their repository). If we have a closely-related repository that we made by cloning, we already have the URL, as our Git saved it under the name origin . So we just run:

git fetch origin

and our Git calls up their Git. They list out their names and hash IDs, and our Git can tell whether we already have some commits—because hash IDs are the true names of objects, and the commits have unique hash IDs—or whether we need them. If we need them, our Git asks for them, which automatically asks for the sender to send the parent hash IDs so that we can see if we have those too. In this way, we have our Git get, from them, all the commits they have that we don't . We don't ask for any of the commits we already have. Git uses this information to figure out not only which commits are new to us, but also which files in those commits are new, and sends us just the new stuff.

Our Git then sticks all the new commits and supporting objects into our objects database. We now have all their commits, plus any we've never given to them. Then our Git updates our remote-tracking names because we know what hash IDs their branch names held at the time we ran git fetch and we have all those commits, so our Git can update our memory of their branch names.

In fact, this git fetch is how the main part of git clone works: all git clone does is create the empty repository, stash the URL away under the name origin , run git fetch , and run one final operation to create and check out a branch name. So if you connect two unrelated repositories with git fetch , once the fetch is done, the two repositories are now related.
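To illustrate, here is a rough by-hand equivalent of git clone, assuming a one-commit source repository in a temporary directory (the names src and manual and the demo identity are hypothetical). It is a sketch of the steps just listed, not a claim about how git clone is implemented internally, and it uses git switch, which needs a reasonably recent Git.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "Their" repository.
git init -q "$work/src"
git -C "$work/src" symbolic-ref HEAD refs/heads/main
echo hello > "$work/src/README"
git -C "$work/src" add README
git -C "$work/src" commit -q -m "first commit"

# Step 1: create an empty repository.
git init -q "$work/manual"
cd "$work/manual"

# Step 2: stash the URL away under the name origin.
git remote add origin "$work/src"

# Step 3: get their commits and create our origin/* names.
git fetch -q origin

# Step 4: create one branch name of our own and check it out.
git switch -q -c main origin/main

git branch -a
```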

The git push command is as close as Git comes to the opposite of git fetch . ( Remember this: git push 's opposite is git fetch , not git pull . This was a mistake in naming that Git made early on and we're just stuck with it now. You just have to memorize it: push/fetch, not push/pull.) There are a few big differences though:

  • With git fetch , the action is "get commits and update remote-tracking names". The default is all new-to-us commits and all remote-tracking names.

  • With git push , the action is send new commits. We must pick which commit our side will start with to offer them new-to-them commits. So we run git push origin somebranch here, to offer to them any new-to-them commits we find with our branch name somebranch . We'll start by offering them just the one tip commit. They presumably don't have it, so they'll say "oh yes, send that one", which means we must offer its parent. If they don't have that they'll ask for it too, which means we offer the grandparent, and so on. Eventually we get back to some commit that they already have—because we got it from them by cloning— or we go all the way back through our history and get to our very first ever commit, that has no parent so we say "that's all there are".

    Once we know which commits are new-to-them, our Git uses its smarts to package up just those commits and the files that are new-to-them as well, and we send over that stuff and they stick it into their repository's objects database. But now we, or they, have a bit of a dilemma. How will they find these objects? They are going to need a name .

    When we used git fetch , we created or updated a remote-tracking name to remember the tip commit of one of their branch names. With git push , they don't have remote-tracking names for us. Instead, we ask them—politely by default—to see if they can, please, create or update one of their branch names to remember our latest commit.

    If somebranch is a new name to them, they can just create that name. That won't disturb any of their existing names. So that's easy, they can just say "ok", do it, and we're all good. But if somebranch is one of their branch names, what they'll do now is check to make sure we're just adding commits to their branch .

Let's go back to our earlier illustration. Suppose we have these names:

...--G--H   <-- main
         \
          I   <-- develop

Let's suppose further that we have remote-tracking names that we got from origin a few seconds, or hours or days, ago:

...--G--H   <-- main, origin/main
         \
          I   <-- develop, origin/develop

We now use git switch develop to pick name develop as our current branch name:

...--G--H   <-- main, origin/main
         \
          I   <-- develop (HEAD), origin/develop

This trick, of adding the special name HEAD to the drawing and "attaching" it to one of the branch names, is how we show which branch name we are using in our own Git repository.

Let's make a new commit now, in our Git repository, in the usual way (edit files, git add , git commit , write a commit message, etc). We get this:

...--G--H   <-- main, origin/main
         \
          I   <-- origin/develop
           \
            J   <-- develop (HEAD)

We now run git push origin develop . We'll offer them commit J , and they will say: Huh, new hash ID, OK, send me J . We'll offer them commit I next, because that's J 's parent. They'll say No thanks, already have that one. We now know all, or almost all, the commits they have! They have I , but that means they have H , which means they have G , and so on, all the way back to the very first commit!

So, our Git will package up commit J and any new-to- J files that aren't duplicates of a file in, say, H (maybe we put a file back to the way it was before). We'll send that over, and then we'll ask them, politely: Please, if it's OK, set your name develop to point to commit J (by commit J 's real hash ID of course).

If their develop —as represented here by our memory, origin/develop —still points to commit I , commit J adds on to their develop . So they'll say OK, done to our polite request and we'll know that their develop now names commit J , and our Git will update our picture like this:

...--G--H   <-- main, origin/main
         \
          I--J   <-- develop (HEAD), origin/develop
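The same sequence can be replayed locally. This sketch assumes a bare repository in a temporary directory playing the role of origin, with commits whose messages match the letters in the drawings (the directory names seed and ours and the demo identity are hypothetical). It shows our origin/develop remembering I before the push and J after it.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "Their" repository: main at H, develop at I.
git init -q --bare "$work/origin.git"
git -C "$work/origin.git" symbolic-ref HEAD refs/heads/main
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/main
echo H > h.txt; git add h.txt; git commit -q -m H
git switch -q -c develop
echo I > i.txt; git add i.txt; git commit -q -m I
git push -q "$work/origin.git" main develop

# Our clone: switch to develop and add commit J.
git clone -q "$work/origin.git" "$work/ours"
cd "$work/ours"
git switch -q develop
echo J > j.txt; git add j.txt; git commit -q -m J

git log -1 --format='before push: origin/develop is %s' origin/develop

# J just adds on to their develop, so they accept the request.
git push -q origin develop

git log -1 --format='after push:  origin/develop is %s' origin/develop
```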

But suppose that, during these seconds or hours or days, someone else added some other new commit K to their develop . That is, they have:

...--G--H   <-- main
         \
          I--K   <-- develop

in their repository (these are their names so we don't have to know or care where their HEAD is, and they don't have J yet).

We send them a new commit J , which they stick in their database:

...--G--H   <-- main
         \
          I--K   <-- develop
           \
            J

and then we ask them, politely, to please-if-it's-OK move their develop to point to J . But this time it's not OK! If they did that, they'd "lose" their commit K . So they will say: No, if I do that, I will lose something. (Git calls this a "non-fast-forward", in Git's usual inscrutable way.) We'll get an error:

 ! [rejected]    develop -> develop (non-fast-forward)

What we need to do about this error is, usually, run git fetch , which will get us commit K , so that we now have:

...--G--H   <-- main, origin/main
         \
          I--K   <-- origin/develop
           \
            J   <-- develop (HEAD)

Once we have that, we can decide:

  • Is commit K good? Or, should we ask them to throw it away?
  • If commit K is good, what do we want to do about this?
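If you want to reproduce the rejection safely before deciding anything, the following sketch sets it up with a bare repository standing in for origin and two clones, ours and theirs (all names, and the demo identity, are hypothetical). The exact wording of the rejection can differ between Git versions and situations (for instance "fetch first" instead of "non-fast-forward"), so the sketch only records whether the push was accepted.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "Their" repository, with develop at commit I.
git init -q --bare "$work/origin.git"
git -C "$work/origin.git" symbolic-ref HEAD refs/heads/develop
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/develop
echo I > i.txt; git add i.txt; git commit -q -m I
git push -q "$work/origin.git" develop

git clone -q "$work/origin.git" "$work/ours"
git clone -q "$work/origin.git" "$work/theirs"

# Someone else adds K and pushes first.
cd "$work/theirs"
echo K > k.txt; git add k.txt; git commit -q -m K
git push -q origin develop

# We add J and try to push: this would drop K, so it is refused.
cd "$work/ours"
echo J > j.txt; git add j.txt; git commit -q -m J
if git push -q origin develop 2>"$work/push.err"; then
  pushed=yes
else
  pushed=no
fi
echo "push accepted? $pushed"
cat "$work/push.err"

# Fetch to obtain K; now both J and K are visible side by side.
git fetch -q origin
git log --oneline --graph develop origin/develop
```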

Our primary choices for "what we want to do" are git rebase and git merge. It's very common to run git fetch and then want to run either git rebase or git merge. That's why git pull exists: it runs git fetch, and then runs either git rebase or git merge. The drawbacks to git pull are many:

  • We don't get to look at commit K . We just assume it's good.
  • We don't get to look at commit K before we decide whether to merge or rebase.
  • We don't even know, if we're beginners at Git, that we're doing all of this.
  • We don't know what git merge and git rebase are!
  • We don't know what to do when git merge or git rebase can't finish , which happens often enough to matter a lot.
  • We don't even know which of git merge or git rebase we picked! We can't know what to do next because we have no idea what's going on.

So, if you're new to Git, don't use git pull (yet). Later, you may want to use it, once you know all this stuff by heart. I still mostly don't use it myself, and I've been using Git for almost two decades. I don't like git pull very much. (In the last year or so, it's grown a new mode that does what I want, but I'd still rather just run two commands.)
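For comparison, here is what the two-command approach can look like, as a sketch under the same kind of local setup as before (a bare stand-in for origin, clones named ours and theirs, demo identity; all hypothetical). It picks git rebase purely for illustration; git merge origin/develop would be the other choice, and J and K touch different files here so nothing conflicts.

```shell
set -e
# Throwaway identity so that git commit works on any machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# "Their" repository with develop at I, plus two clones.
git init -q --bare "$work/origin.git"
git -C "$work/origin.git" symbolic-ref HEAD refs/heads/develop
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/develop
echo I > i.txt; git add i.txt; git commit -q -m I
git push -q "$work/origin.git" develop
git clone -q "$work/origin.git" "$work/ours"
git clone -q "$work/origin.git" "$work/theirs"

# Someone else publishes K; we have made J locally.
cd "$work/theirs"
echo K > k.txt; git add k.txt; git commit -q -m K
git push -q origin develop
cd "$work/ours"
echo J > j.txt; git add j.txt; git commit -q -m J

# Command one: fetch, then look at what arrived before deciding anything.
git fetch -q origin
git log --format='new on their side: %s' develop..origin/develop

# Command two: our explicit choice (here: rebase J on top of K).
git rebase -q origin/develop

# Now J adds on to their develop, so the push is accepted.
git push -q origin develop
git log --format=%s develop
```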

Conclusion

Before you worry about whether you should shove another project into a repository (you certainly can , but maybe you shouldn't , this is a matter of judgement), learn what a repository does for you. Decide whether you want to map "project" to "repository" one-to-one, or many-to-many, or many-to-one, or one-to-many, or whatever. There are pluses and minuses to the "monorepo" (everything in one repository) and "polyrepo" (many repositories for one or many projects) approaches. You won't always get all this right the first time, but be aware that this is what you're doing here.

