I have a script that I wrote to help automate large pull requests to master in Git. I'm trying to remove the need for the user to log in for operations such as pulling to update the branch. I created a personal access token in Bitbucket Server to see whether I could get it working for myself, and it does work. A personal token wouldn't work for everyone, but I was hoping to work out the right syntax to test it.
The command I came up with is this:
subprocess.check_call(['git', 'pull']+[f'https://{username}:{MYTOKEN}@{repo_url}'], cwd=repo_path)
But I get really weird behavior from it where it pulls in a bunch of files from the script I made, and then a bunch that I didn't touch. In both cases, I never pushed anything to the remote branch, or committed the local branch. There's nothing in the staged area either.
So, I tried this to see what would happen, and I get correct behavior where it says my repo is up-to-date and there's nothing to pull. This matches the manual git pull that I had been doing since I didn't actually touch any files. But it requires the user to enter credentials which is what I was trying to get rid of.
subprocess.check_call(['git', 'pull'], cwd=repo_path)
Any idea what would cause something like this?
First, let me say that you generally should not use git pull here, because you're writing a script that should not be interactive, and git pull tries to be interactive when it runs its second command. (There are some ways to work around this, especially in modern versions of Git, but it will help to break things up into the two separate steps here.)
With that out of the way, regardless of whether you use git pull to run git fetch, or run git fetch yourself, there is one key difference between:
git <command>
and
git <command> https://username:password@bitbucket.server/path/to/repo
and that is that the second command provides a URL. This key difference actually matters twice, as we'll see.
The first command lacks any URL-or-remote argument, so it looks up the remote based on the current branch, or uses origin as the remote if there is no remote for the current branch (or there is no current branch, as in detached HEAD state). This is true regardless of whether the command here is fetch or pull, because git pull invokes git fetch with the argument you provided. So either way, we run git fetch: one way with no extra arguments, and the other way with a URL.
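As a rough illustration of that lookup rule, here is a minimal Python sketch. The function names (pick_remote, default_remote) are made up for this example, and the logic only mirrors the rule as described above (the branch's configured remote, else origin); it is not a copy of Git's own implementation.

```python
import subprocess


def pick_remote(current_branch, branch_remotes):
    """Apply the rule described above.

    current_branch: branch name, or None when HEAD is detached.
    branch_remotes: dict mapping branch name -> configured remote name.
    """
    if current_branch is not None:
        remote = branch_remotes.get(current_branch)
        if remote:
            return remote
    # No current branch, or no remote configured for it.
    return "origin"


def _git_stdout(repo_path, *args):
    """Run a git command; return stripped stdout, or None on non-zero exit."""
    result = subprocess.run(
        ["git", *args], cwd=repo_path, capture_output=True, text=True
    )
    return result.stdout.strip() if result.returncode == 0 else None


def default_remote(repo_path):
    """Ask Git for the current branch and its remote, then apply the rule."""
    # symbolic-ref exits non-zero (quietly, with -q) when HEAD is detached.
    branch = _git_stdout(repo_path, "symbolic-ref", "--short", "-q", "HEAD")
    remotes = {}
    if branch:
        remote = _git_stdout(repo_path, "config", "--get", f"branch.{branch}.remote")
        if remote:
            remotes[branch] = remote
    return pick_remote(branch or None, remotes)
```

Calling default_remote(repo_path) in your script would tell you which remote a bare git fetch is expected to contact.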
When git fetch is given a remote, this enables some nice features. In particular, git fetch will update the corresponding remote-tracking names. Providing a URL prevents git fetch from updating the remote-tracking names. This isn't the immediate source of the problem, and it isn't crucial, but it's one you can't easily solve. It's just something to keep in mind: it will be a minor, but constant, annoyance later.
More importantly, though, this affects the second command that git pull runs. When git fetch finishes, it writes a file named FETCH_HEAD in the Git repository directory (typically .git). When running git fetch with a remote, we end up with contents like this:
$ cat .git/FETCH_HEAD
1c52ecf4ba0f4f7af72775695fee653f50737c71 branch 'master' of <url>
898f80736c75878acc02dc55672317fcc0e0a5a6 not-for-merge branch 'maint' of <url>
bcca9488540da62a407e744ef77a8abcf8e92efe not-for-merge branch 'next' of <url>
1c4d5706c6ff6a04567b24d4b3168b09793a83f9 not-for-merge branch 'seen' of <url>
32af5571f1841d138c786b68d4ec8c6a07752540 not-for-merge branch 'todo' of <url>
a8eaf9de52c2d49799d7dc724e688ccbfa74390c not-for-merge tag 'v2.30.0-rc0' of <url>
When we run the same command with a URL (even the same URL that git fetch origin would use), we get instead:
1c52ecf4ba0f4f7af72775695fee653f50737c71 <url>
Note that all the various branch names, and the not-for-merge markers on the branches that should be ignored for the next step, are missing. The only hash ID in the FETCH_HEAD file is the one corresponding to the HEAD in the other Git repository at the given url.
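To check which case you are in from a script, you can read FETCH_HEAD yourself. The sketch below is only an illustration: parse_fetch_head is a name invented here, and it assumes the on-disk file separates its three fields (hash, optional not-for-merge marker, description) with tab characters, even though the listing above shows them with spaces.

```python
def parse_fetch_head(text):
    """Parse FETCH_HEAD contents into (hash, for_merge, description) tuples.

    Assumes each line is "<hash>\t<marker>\t<description>", where the
    marker is either empty or the literal string "not-for-merge".
    """
    entries = []
    for line in text.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        oid, marker, description = parts
        entries.append((oid, marker != "not-for-merge", description))
    return entries


def merge_candidates(text):
    """Return only the entries that are not marked not-for-merge."""
    return [entry for entry in parse_fetch_head(text) if entry[1]]
```

If the only merge candidate has a description that is just the URL, with no branch name in it, the fetch recorded the other repository's HEAD rather than a specific branch.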
The second command that git pull runs is:
git <command> <options> <hash>
The command part here is normally either merge or rebase (there's one very special case, which won't apply here, where it is neither of these). The options depend on the command, since git merge gets a -m option while git rebase does not (but can get other options). The hash is the real problem here.
The hash ID that git pull supplies to either git merge or git rebase comes out of that .git/FETCH_HEAD file. When using a remote, one particular line of that file will correspond to the upstream of the current branch, and that's the hash ID that Git will use. But when git fetch was given a URL instead of a remote name, the fetch command wrote only one hash ID: that of the other Git repository's HEAD. If this isn't the right hash ID, your second command will use the wrong hash ID.
This is almost certainly what is happening.
You can fix it by passing a branch name to git pull so that it can pass it on to git fetch, and/or by running git fetch yourself, then running a second Git command yourself if/as needed. Given that git pull is designed to be interactive and hence changes the way it behaves based on the user's preference configuration settings, I'd suggest doing both: run git fetch yourself, then figure out what second command you'd like to run.
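A minimal sketch of the fetch step, in Python to match the question, might look like the following. The helper names (fetch_command, fetch_branch) are invented for this example, and source can be either a remote name or a URL such as the token-bearing one from the question. The point is only that the branch is named explicitly, so FETCH_HEAD records that branch's tip rather than the other repository's HEAD.

```python
import subprocess


def fetch_command(source, branch):
    """Build a fetch command that names the branch explicitly.

    source: a remote name (e.g. "origin") or a URL.
    branch: the branch to fetch from the other repository.
    """
    return ["git", "fetch", source, branch]


def fetch_branch(repo_path, source, branch):
    """Fetch one branch and return the commit hash that was fetched."""
    subprocess.check_call(fetch_command(source, branch), cwd=repo_path)
    # With a single named branch, FETCH_HEAD resolves to that branch's tip.
    out = subprocess.check_output(
        ["git", "rev-parse", "FETCH_HEAD"], cwd=repo_path, text=True
    )
    return out.strip()
```

The returned hash is what you would then hand to whichever second command you choose.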
The fetch may still need the access token. (Note, by the way, that passing this on the command line makes the access token readable to other processes on the machine, so it's not very secure. It's perhaps a bit more secure to set up a remote that holds the token, in a file that's secured, then use a remote name. The remote name will control the remote-tracking names, which continue to be that ongoing annoyance I mentioned earlier. There are further workarounds for this, but once you have some remote-tracking names, the annoyance level is really fairly low, and this might be good enough.) But now, regardless of how you do it, the fetch can be given the name of any branch or tag you want resolved in the other Git repository, so as to find the right commit.
The second command can still be git merge, perhaps with --ff-only, or rebase, with a check that it succeeds and a rollback on failure. It could even be a git checkout to use a detached HEAD rather than an attempt to change any existing branch names in the local repository. The important thing, though, is that by knowing that git pull really means fetch, then run a second command, and knowing what the various options are here, you can take control of all the parts of the two operations. You won't be at the whims of git pull using the wrong options and/or wrong branch names.
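For the second step, a small Python sketch could cover two of the options mentioned above: a fast-forward-only merge and a detached-HEAD checkout. The function names and the strategy labels ("ff-only", "detach") are hypothetical, chosen just for this example; a rebase with rollback would need more handling and is left out.

```python
import subprocess


def second_command(strategy, commit="FETCH_HEAD"):
    """Build the command to run after git fetch.

    strategy: "ff-only" for a fast-forward-only merge, or
              "detach" to check the commit out as a detached HEAD.
    commit:   a hash or name to move to (defaults to FETCH_HEAD).
    """
    if strategy == "ff-only":
        return ["git", "merge", "--ff-only", commit]
    if strategy == "detach":
        return ["git", "checkout", "--detach", commit]
    raise ValueError(f"unknown strategy: {strategy!r}")


def run_second_command(repo_path, strategy, commit="FETCH_HEAD"):
    """Run the chosen command; return True if Git reported success."""
    result = subprocess.run(second_command(strategy, commit), cwd=repo_path)
    return result.returncode == 0
```

Because the script checks the return code itself, it can decide what to do when a fast-forward is not possible instead of leaving that to whatever git pull is configured to do.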