Add submodule to git repository but to specific commit

Question

I want to add a git repository as a submodule. But instead of using the most recent version of this external repository, I want a previous commit. Using git submodule add adds the head of the repository, but it does not seem to add the commit history. Therefore, I cannot check out a previous commit. How could I do that?

Answer 1

I have to guess here, because (as larsks noted in a comment) you have underspecified your actual problem. But usually the issue here is that you have forgotten that, when using submodules, you typically have a minimum of four repositories you must manipulate, in the correct order:

There's a superproject repository somewhere (e.g., on GitHub) that you clone: let's call this the "server superproject repo";
there's a clone of the superproject repo, which is also a superproject repo, on your laptop (or some other machine); I'll call it "your laptop" for concreteness;
there's a submodule repository somewhere (e.g., on GitHub): let's call this the "server submodule repo"; and
there's a clone of the submodule repo on your laptop.

To get started, you run:

git clone <url>

on your laptop, where the url is that of the superproject on the server. This creates your own first laptop clone. Next, in your superproject clone on your laptop, you run:

git submodule update --init

(perhaps with --recursive if there are submodules in the submodule, which makes the submodule itself a superproject and multiplies the number of repositories even further; let's assume there's just the one to keep things simple). This command enters the submodule's directory within the superproject clone on the laptop and runs git clone for you, using the instructions in the .gitmodules file of your laptop superproject clone.

At this point we have all four clones: two on the server, and two on the laptop. There's some specific commit checked out in the laptop superproject repository, probably a commit selected by a branch name. That commit contains a gitlink that specifies a raw hash ID for some commit in the submodule clone, and that specific commit is now checked out in the submodule clone on your laptop.
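To see the gitlink for yourself, you can ask the superproject which hash ID it records and compare that with what is checked out in the submodule. The following is only a sketch in a scratch directory: the names lib, super and vendor/lib are made up, the local lib repository stands in for the server submodule repo, and protocol.file.allow=always is needed only because the demo clones from a local path.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the "server submodule repo" (hypothetical name: lib).
git init -q "$work/lib"
git -C "$work/lib" commit -q --allow-empty -m "lib: first commit"
lib_head=$(git -C "$work/lib" rev-parse HEAD)

# Stand-in for the laptop superproject clone, with lib added at vendor/lib.
git init -q "$work/super"
cd "$work/super"
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "Add lib as a submodule"

# The gitlink is a tree entry with mode 160000 and type "commit".
entry=$(git ls-tree HEAD vendor/lib)
echo "$entry"
gitlink_mode=${entry%% *}
gitlink_hash=$(git rev-parse HEAD:vendor/lib)

# The commit actually checked out in the submodule clone.
checked_out=$(git -C vendor/lib rev-parse HEAD)
echo "gitlink: $gitlink_hash  checked out: $checked_out"
```

Right after git submodule add (or git submodule update --init) the two hash IDs should agree.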

You want a different commit selected on your laptop in the submodule clone. Assuming that commit exists in the clone on your laptop, you just need to:

(cd path/to/submodule && git switch --detach <hash>)

where you fill in the raw hash ID of the desired commit.
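Here is a rough sketch of what that looks like from the superproject's point of view, again in a scratch directory with made-up names (lib, super, vendor/lib) and a local repository standing in for the server. After the switch, git submodule status marks the submodule with a leading "+", meaning the checked-out commit no longer matches the recorded gitlink.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the server submodule repo, with an older and a newer commit.
git init -q "$work/lib"
git -C "$work/lib" commit -q --allow-empty -m "lib: older commit"
old=$(git -C "$work/lib" rev-parse HEAD)
git -C "$work/lib" commit -q --allow-empty -m "lib: newest commit"

# Laptop superproject whose gitlink currently points at the newest commit.
git init -q "$work/super"
cd "$work/super"
# protocol.file.allow=always is only needed because this demo clones from a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib
git commit -q -m "Add lib as a submodule (gitlink = newest commit)"

# The step from the answer: select the older commit inside the submodule clone.
(cd vendor/lib && git switch -q --detach "$old")

# A leading "+" means the checked-out commit differs from the recorded gitlink.
status=$(git submodule status vendor/lib)
echo "$status"
now=$(git -C vendor/lib rev-parse HEAD)
```

Nothing is recorded in the superproject yet at this point; that takes a git add and git commit there.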

If the commit does not exist in your laptop clone you must obtain it from somewhere. Where? Well, one option is to run:

(cd path/to/submodule && git fetch origin)

which has your submodule Git repository contact the server's Git repository and fetch any new commits from it. The server needs to have the commit in question. If it doesn't, perhaps because the commit has never been made (in which case, where did you get the hash ID?) or, more likely, because someone forgot to use git push to send it to the submodule repository on the server, you'll have to cause it to be made or sent to the server, or fetch it from a Git repository that does have it.
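If you aren't sure whether the commit is already in your laptop clone, git cat-file -e can tell you before and after the fetch. This is a sketch under assumptions: server-lib and laptop-lib are made-up names, and a plain clone stands in for the submodule clone, since the commands behave the same inside a real submodule directory.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the server's submodule repo.
git init -q "$work/server-lib"
git -C "$work/server-lib" commit -q --allow-empty -m "lib: first commit"

# Stand-in for the submodule clone on the laptop.
git clone -q "$work/server-lib" "$work/laptop-lib"

# A new commit appears on the server after the laptop cloned.
git -C "$work/server-lib" commit -q --allow-empty -m "lib: later commit"
wanted=$(git -C "$work/server-lib" rev-parse HEAD)

cd "$work/laptop-lib"
# cat-file -e exits 0 only if the object exists; ^{commit} insists it is a commit.
if git cat-file -e "$wanted^{commit}" 2>/dev/null; then before=present; else before=missing; fi

git fetch -q origin
if git cat-file -e "$wanted^{commit}" 2>/dev/null; then after=present; else after=missing; fi
echo "before fetch: $before, after fetch: $after"

# Now the commit can be selected.
git switch -q --detach "$wanted"
```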

Let's say, for argument's sake, that the commit doesn't exist anywhere yet. That means you have to make the commit. The way you do this is:

cd path/to/submodule
git switch <some-branch-name>

You want to be "on" a branch when you make a new commit, because you want to git push this commit back to the server submodule, and it's easiest to do that while "on" some branch. You may need to create a new branch, if that's appropriate: you will do the same things you would do in any repository here.

You then do your work, updating files, using git add, and running git commit as usual. This creates a new commit in your submodule-repo laptop clone. This is the only place this commit exists as yet!

You may now wish to git push this new commit to the server. You can defer this git push, but eventually this commit must go to the server's clone of the submodule, assuming it turns out to be the commit you want to use. Since you got yourself "on" a branch, pushing it is as simple as running:

git push origin thisbranch

or perhaps:

git push -u origin thisbranch

if you intend to keep working and adding more commits—this is a workflow issue and a matter of judgement and opinion about how to do work within the repository, which is the same kind of judgement you always have to make with any Git repository. The fact that the repository is being used as a submodule is not particularly relevant other than however that might affect your opinions and judgement here.
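Put together, the submodule side of this might look like the sketch below. Everything named here is hypothetical: server-lib.git is a local bare repository standing in for the server submodule repo, laptop-lib stands in for the submodule clone, and fix-readme is just an example branch name.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the server submodule repo: a bare repository with one commit.
git init -q "$work/seed"
git -C "$work/seed" commit -q --allow-empty -m "lib: first commit"
git clone -q --bare "$work/seed" "$work/server-lib.git"

# Stand-in for the submodule clone on the laptop, left on a detached HEAD
# the way a superproject leaves it.
git clone -q "$work/server-lib.git" "$work/laptop-lib"
cd "$work/laptop-lib"
git switch -q --detach

# Get "on" a branch first, then do the work and commit it.
git switch -q -c fix-readme
echo "hello" > README
git add README
git commit -q -m "lib: add README"
new=$(git rev-parse HEAD)

# Send the new commit to the server; -u also sets the upstream for later pushes.
git push -q -u origin fix-readme

on_server=$(git -C "$work/server-lib.git" rev-parse refs/heads/fix-readme)
echo "laptop: $new  server: $on_server"
```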

Now that the commits exist in two repositories—the submodule clone on your laptop and the submodule clone on the server—it's entirely safe to make the superproject depend on those commits. You do this by returning to the superproject repository clone on your laptop and running:

git add path/to/submodule

to record a new gitlink in the index of the superproject clone on the laptop. You're presumably already on a branch in the superproject, which is why you didn't have to switch to (or create) one, but if you want to switch to (or create) one, you can do this before the git add step as usual.

Once you've run git add on the submodule path, you might want to git add any other modified working tree files as usual. You're now ready to run git commit to create a new commit in the laptop repository for the superproject.

Once you've created this new superproject commit, you can git push the superproject commit from the laptop superproject repo to the server superproject clone. Since you've already made sure that the new submodule commit you made earlier exists in the server's submodule clone, it's safe to have the server's superproject clone refer to that commit by that commit's unique hash ID.
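For the case in the question (add a submodule, but pinned to an earlier commit that already exists on the server), the whole sequence can be sketched as follows. The names lib, super and vendor/lib are invented for the demo, a local repository stands in for the server, and the final push of the superproject is left out. Note that git submodule add makes an ordinary clone, with history, unless you ask for a shallow one with --depth.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the server submodule repo: the commit we want, then a newer one.
git init -q "$work/lib"
git -C "$work/lib" commit -q --allow-empty -m "lib: the commit we want"
wanted=$(git -C "$work/lib" rev-parse HEAD)
git -C "$work/lib" commit -q --allow-empty -m "lib: newest commit"

git init -q "$work/super"
cd "$work/super"

# 1. Add the submodule; this clones it and checks out its newest commit.
#    (protocol.file.allow=always is only needed for this local-path demo.)
git -c protocol.file.allow=always submodule --quiet add "$work/lib" vendor/lib

# 2. Move the submodule clone to the commit we actually want.
git -C vendor/lib switch -q --detach "$wanted"

# 3. Record that commit as the gitlink, then commit the superproject.
git add vendor/lib
git commit -q -m "Add lib submodule pinned to an earlier commit"

recorded=$(git rev-parse HEAD:vendor/lib)
echo "superproject pins vendor/lib at $recorded"
```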

The pain comes from managing the multiple repos plus the detached HEAD

Most of the pain with submodules occurs in two parts:

There are way more clones to deal with. You must manipulate all of them in the right order.
Submodules normally operate in "detached HEAD" mode: the superproject Git does a detached-HEAD mode git checkout or git switch in the submodule. As such, commits you make in the submodule are on a detached HEAD, making it hard to git push them back to the origin repository. Between the fact that people forget that this is needed, and the fact that even if they remember, they screw it up like this, the appropriate submodule commit sometimes isn't in the server's submodule, even though the server's superproject calls for that commit by hash ID!
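If you have already committed on a detached HEAD in the submodule, the commit is not lost: you can attach a branch name to it and push that. A small sketch, using a plain repository as a stand-in for the submodule clone and rescue-work as a made-up branch name:

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/lib"
cd "$work/lib"
git commit -q --allow-empty -m "lib: first commit"

# Mimic what the superproject does to a submodule: detach HEAD.
git switch -q --detach

# symbolic-ref succeeds only when HEAD points at a branch.
if git symbolic-ref -q HEAD >/dev/null; then state=on-branch; else state=detached; fi
echo "HEAD is $state"

# A commit made now belongs to no branch.
git commit -q --allow-empty -m "lib: work done while detached"
orphan=$(git rev-parse HEAD)

# Rescue it: create a branch at the current commit and switch to it.
git switch -q -c rescue-work
branch=$(git symbolic-ref --short HEAD)
echo "now on $branch at $orphan"
```

From there, git push origin rescue-work works as in any other repository.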

You can't get these problems without submodules because a Git commit is a full snapshot of every file. You can't accidentally omit an entire folder or whatever. But a submodule is just a raw hash ID: you can accidentally omit the commit (from the server's clone) and that omits the entire folder! The superproject on the server calls for commit a123456 in the submodule, but nobody ever put commit a123456 anywhere. It's on Joe's laptop, and Joe is on an airline flight and won't even be able to deliver it back to the server for another ten hours. You can't get it: it's nowhere to be found.
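One guard against Joe's mistake, not mentioned above, is git push --recurse-submodules=check, which should make Git refuse to push a superproject commit when the submodule commit it calls for is not on any of the submodule's remotes. The sketch below reproduces the situation in a scratch directory; server-lib.git and server-super.git are local bare repositories standing in for the two server repos, and all names are invented.

```shell
set -eu
# Throwaway identity so the demo commits work without any global Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Bare stand-ins for the two server repos.
git init -q "$work/seed"
git -C "$work/seed" commit -q --allow-empty -m "lib: first commit"
git clone -q --bare "$work/seed" "$work/server-lib.git"
git init -q --bare "$work/server-super.git"

# Laptop superproject with the submodule added and pushed.
git init -q "$work/super"
cd "$work/super"
# protocol.file.allow=always is only needed because this demo clones from a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/server-lib.git" vendor/lib
git commit -q -m "Add lib as a submodule"
git remote add origin "$work/server-super.git"
git push -q origin HEAD

# Joe's mistake: commit in the submodule, bump the gitlink, never push the submodule.
git -C vendor/lib commit -q --allow-empty -m "lib: only on Joe's laptop"
git add vendor/lib
git commit -q -m "Bump lib"

# With =check, the superproject push is expected to be refused.
if git push -q --recurse-submodules=check origin HEAD 2>/dev/null; then
  result=pushed
else
  result=refused
fi
echo "superproject push: $result"
```

Setting push.recurseSubmodules to check in the superproject's configuration makes this the default for every push.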
