Assume we have two git repos, one a submodule of the other (A will be the superproject, B will be the submodule). Project A is not source code per se, but rather a project that gathers and tracks information about its submodule(s). The A repo rarely, if ever, exists on local machines; instead, a bunch of scripts keep it updated.
One day, someone realized that repo B should have been using LFS better and cleaned up the repo using git lfs migrate import. I have a list of B's old hashes and new hashes.
As repo A happens to be linear (no branching), I was able to do a git rebase --root -i, change all the commits to edit, and run a simple bash script that reset the submodule to the new hashes. Here's an example of the script:
#!/bin/bash
# Set the submodule path and input files
submodulePath=foo
newHashesFile=NewHashes.txt
originalHashesFile=OriginalHashes.txt
# Keep going for as long as a rebase is in progress
while test -d "$(git rev-parse --git-path rebase-merge)" || test -d "$(git rev-parse --git-path rebase-apply)"; do
    numLines=$(git ls-files --stage | grep -c "$submodulePath")
    if [ "$numLines" = 1 ]; then
        oldHash=$(git ls-files --stage | grep "$submodulePath" | sed -e 's/^160000 \([^ ]*\) 0.*$/\1/')
        echo "oldHash: $oldHash"
    else
        echo "merge conflict"
        # On a conflict, take the stage-3 entry for the submodule
        oldHash=$(git ls-files --stage | grep "$submodulePath" | grep '^160000 [^ ]* 3' | sed -e 's/^160000 \([^ ]*\) 3.*$/\1/')
        echo "oldHash: $oldHash"
    fi
    # Line N of the original-hash file corresponds to line N of the new-hash file
    lineNumber=$(grep -n "$oldHash" "$originalHashesFile" | sed -e 's/^\([^:]*\):.*/\1/')
    if [ -z "$lineNumber" ]; then
        echo "Hash not changed"
    else
        newHash=$(sed -n "${lineNumber}p" "$newHashesFile")
        (cd "$submodulePath" && git reset --hard "$newHash")
    fi
    git add "$submodulePath"
    git commit --amend
    git rebase --continue
done
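The line-number lookup in the script relies on the two hash files being parallel (line N of one matches line N of the other). As a minimal sketch, assuming that layout holds, the two files could be merged once into a single "old,new" file and queried directly. The function names build_hash_map and lookup_new_hash are hypothetical, not part of git.

```shell
#!/bin/sh
# build_hash_map: join two parallel files into "old,new" lines.
# $1: file of original hashes, $2: file of new hashes (same line order).
build_hash_map() {
    paste -d, "$1" "$2"
}

# lookup_new_hash: print the new hash for an old one.
# $1: old hash, $2: map file produced by build_hash_map.
# Returns non-zero when the old hash is not listed.
lookup_new_hash() {
    awk -F, -v old="$1" '
        $1 == old { print $2; found = 1; exit }
        END { exit !found }
    ' "$2"
}
```

With this, something like build_hash_map OriginalHashes.txt NewHashes.txt > map.csv would be run once, and the loop body could call lookup_new_hash "$oldHash" map.csv instead of the grep -n / head / tail combination.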
All this worked, but I was wondering if there is a simpler way to do so, as I assume I'll be called on to do this again. There are two parts to that question:
1. Is there a simple way to tell git that you want the default to be edit instead of pick, not dependent on the editor?
2. Is there a simpler way of telling git to do what the script does, possibly even while running git lfs migrate import from within the superproject?

Is there a simple way to tell git that you want the default to be edit instead of pick, not dependent on the editor?
No. There is, however, a way to set the editor used for the sequence of commands (the rebase todo list) separately from other editors: set the environment variable GIT_SEQUENCE_EDITOR. So, for instance, you can do:
GIT_SEQUENCE_EDITOR="sed -i '' s/^pick/edit/" git rebase -i ...
(assuming your sed has a -i that works this way; the separate '' argument is the BSD/macOS form, while GNU sed takes -i with no separate argument).
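Because the -i syntax differs between sed implementations, one option is to avoid it. The following is a minimal sketch, not anything built into git: a function (the name pick_to_edit is hypothetical) that rewrites the todo file through a temporary file. It relies only on the fact that git calls the sequence editor with the path of the todo file as its argument.

```shell
#!/bin/sh
# pick_to_edit: turn every "pick" line of a rebase todo file into "edit".
# $1: path of the todo file.
# A temporary file is used instead of "sed -i", whose syntax differs
# between GNU and BSD sed.
pick_to_edit() {
    todo=$1
    tmp=$(mktemp) || return 1
    sed 's/^pick /edit /' "$todo" > "$tmp" && cat "$tmp" > "$todo"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

To use it, the function could be saved in an executable script whose last line is pick_to_edit "$1", and GIT_SEQUENCE_EDITOR set to the path of that script when running git rebase -i.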
Is there a simpler way of telling git to do what the script does?
Given that you want to update each gitlink hash, I'd use git filter-branch (rather than git rebase) to do it, with an --index-filter that does the gitlink hash updates. I'm not sure this is any simpler, but it's more direct. The index filter itself would use git ls-files --stage, similar to the way you do it, but would probably pass that output through a generated sed script or an awk script. Generated sed would probably be faster, while awk would be simpler, especially if you have a modern awk where you can just read in the hash mapping.
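The awk variant could look roughly like the sketch below. It assumes the mapping is stored as "oldhash,newhash" lines, and the function name remap_gitlinks is hypothetical. It reads git ls-files --stage output on stdin and rewrites the hash of gitlink entries (mode 160000) that appear in the mapping, leaving every other line untouched.

```shell
#!/bin/sh
# remap_gitlinks: rewrite gitlink hashes in "git ls-files --stage" output.
# $1: mapping file with one "oldhash,newhash" pair per line (assumed format).
# stdin: lines of the form "<mode> <hash> <stage><TAB><path>".
remap_gitlinks() {
    awk -v mapfile="$1" '
        BEGIN {
            # Load the whole mapping into an associative array
            while ((getline line < mapfile) > 0) {
                split(line, pair, ",")
                map[pair[1]] = pair[2]
            }
            close(mapfile)
        }
        # Only gitlink entries with a known old hash are rewritten
        $1 == "160000" && ($2 in map) { sub($2, map[$2]) }
        { print }
    '
}
```

Since git update-index --index-info accepts lines in the git ls-files --stage format, an index filter could plausibly be built as git ls-files --stage | remap_gitlinks /path/to/map | git update-index --index-info, with the function defined inside the filter string. That combination is untested here and is offered only as a starting point.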
After having to do this a few times over the years, I took torek's advice and rewrote my overly verbose bash script as a single git filter-branch command. I'm posting it here, both for other users and for future me.
First, just to clarify how I did the lfs migrate import (and I'm sure I took the long route for some of these lines):
# Make sure we have the up-to-date remote branches
git submodule update --init SubmodulePath/
cd SubmodulePath/
git fetch --all
# Create local branches that mirror the remote ones
git branch -lr | grep -v "origin/HEAD" | sed 's/^.*origin\///' |
xargs -I @ git branch @ origin/@ --force
#Find all files that git identifies as binary and create the lfs migrate command, then run it
git log --all --numstat | grep '^-' | cut -f3 | sed 's|^.*/\(.*\)|\1|' | sed 's|^.*\.\([^.]*\)|\1|' |
sort -u --ignore-case | sed 's|\([^0-9]\)|[\L\1\U\1]|g' | awk '{print}' ORS=',*.' |
sed 's|^\(.*\),\*\.$|git lfs migrate import --everything --object-map=LFSImport.txt --include="*.\1"|' | . /dev/stdin
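The \L and \U escapes in the sed step above are GNU sed extensions. As a hedged alternative for systems without GNU sed, the case-insensitive --include pattern can be built with awk. This is a sketch only: the function name build_include_pattern is hypothetical, and unlike the sed version it brackets only letters, not every non-digit character.

```shell
#!/bin/sh
# build_include_pattern: read one file extension per line on stdin and
# print a comma-separated list of case-insensitive glob patterns,
# e.g. "png" becomes "*.[pP][nN][gG]".
build_include_pattern() {
    awk '
        {
            pat = "*."
            for (i = 1; i <= length($0); i++) {
                c = substr($0, i, 1)
                lo = tolower(c)
                up = toupper(c)
                # Characters without case (digits etc.) are kept as-is
                pat = pat (lo == up ? c : "[" lo up "]")
            }
            out = out (NR > 1 ? "," : "") pat
        }
        END { print out }
    '
}
```

The result could then be passed as the value of --include in the git lfs migrate import command, in place of the generated-and-sourced command line.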
I then moved LFSImport.txt to a different directory (I also committed it to the submodule repo) and ran the filter-branch with index-filter:
git filter-branch -f --index-filter '
    # Only act when the submodule has exactly one index entry
    numLines=$(git ls-files --stage | grep -c SubmodulePath)
    if [ "$numLines" = 1 ]; then
        echo
        oldHash="$(git rev-parse --quiet --verify :SubmodulePath)"
        echo "oldHash: $oldHash"
        # LFSImport.txt holds "old,new" pairs; pick the second column
        newHash="$(grep "$oldHash" /path/to/LFSImport.txt | cut -d , -f2)"
        echo "newHash: $newHash"
        git update-index --add --cacheinfo 160000 "$newHash" SubmodulePath
    fi
' HEAD
I probably should have added a check on $newHash to make sure it wasn't empty (it was in one commit of mine, but I manually just set it to something else that didn't exist). As torek mentioned, this was cleaner, faster and worked just as well, if not better.
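The missing check could look something like this sketch. The function name new_hash_or_fail is hypothetical, and the "old,new" file format is the one written by --object-map in the commands above.

```shell
#!/bin/sh
# new_hash_or_fail: print the new hash mapped to an old one.
# $1: old hash, $2: mapping file with "old,new" lines.
# Prints a message on stderr and returns non-zero if nothing is mapped.
new_hash_or_fail() {
    newHash=$(grep "^$1," "$2" | cut -d , -f2)
    if [ -z "$newHash" ]; then
        echo "no new hash for $1" >&2
        return 1
    fi
    printf '%s\n' "$newHash"
}
```

Inside the index filter the same [ -z "$newHash" ] test can be written inline, since a shell function defined outside the filter string is not visible to it. A non-zero exit status from the filter makes git filter-branch abort, so a missing mapping would stop the rewrite rather than write an empty gitlink.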