
How do I rebase a git superproject changing the hashes of the submodules?

Background

Assume we have two git repos, one a submodule of the other (A is the superproject, B is the submodule). Project A is not source code per se; rather, it is a project that gathers and tracks information about its submodule(s). The A repo rarely, if ever, exists on local machines; a set of scripts keeps it updated.

One day, someone realized that repo B should have been making better use of LFS and cleaned up the repo using git lfs migrate import. I have a list of B's old hashes and the corresponding new hashes.

What I did

As repo A happens to be linear (no branching), I was able to run git rebase --root -i, change all the commits to edit, and run a simple bash script that reset the submodule to the new hashes. Here's an example of the script:

#!/bin/bash
#set the submodule path and input files
submodulePath=foo
newHashesFile=NewHashes.txt
originalHashesFile=OriginalHashes.txt

# keep going for as long as a rebase is in progress
while test -d "$(git rev-parse --git-path rebase-merge)" || test -d "$(git rev-parse --git-path rebase-apply)"; do
    numLines=`git ls-files --stage | grep $submodulePath | wc -l`
    if [ $numLines = 1 ];
    then
        oldHash=`git ls-files --stage | grep $submodulePath | sed -e 's/^160000 \([^ ]*\) 0.*$/\1/g'`
        echo oldHash: $oldHash
    else
        echo merge conflict
        oldHash=`git ls-files --stage | grep $submodulePath | grep '^160000 \([^ ]*\) 3.*' | sed -e 's/^160000 \([^ ]*\) 3.*$/\1/g'`
        echo oldHash: $oldHash    
    fi

    # find the line of the old hash; the new hash sits on the same line of the other file
    lineNumber=`grep -n "$oldHash" "$originalHashesFile" | sed -e 's/^\([^:]*\):.*/\1/g'`

    if [ -z "$lineNumber" ];
    then
        echo Hash not changed
    else
        newHash=`head -n "$lineNumber" "$newHashesFile" | tail -n 1`
        cd "$submodulePath"
        git reset --hard "$newHash"
        cd ../
    fi

    git add $submodulePath/
    git commit --amend
    git rebase --continue
done

Question

All this worked, but I was wondering if there is a simpler way to do it, as I assume I'll be called on to do this again. There are two parts to that question.

  1. Is there a simple way to tell git that you want the default to be edit instead of pick, not dependent on the editor?
  2. Is there a simpler way of telling git to do what the script does? Would it help if I did the git lfs migrate import from within the superproject?

Is there a simple way to tell git that you want the default to be edit instead of pick, not dependent on the editor?

No. There is, however, a way to give the sequence of commands (the rebase todo list) its own editor, separate from the editor used for everything else: set the environment variable GIT_SEQUENCE_EDITOR. So, for instance, you can do:

GIT_SEQUENCE_EDITOR="sed -i '' s/^pick/edit/" git rebase -i ...

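(assuming your sed has a -i that works this way: the separate empty '' argument is the BSD/macOS form, while GNU sed wants plain -i with no separate suffix argument).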
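A minimal sketch of a form that avoids the BSD/GNU difference, assuming a sed that accepts an attached backup suffix (-i.bak), which both GNU and BSD sed do. The helper name pick_to_edit and the sample todo lines are made up for illustration; they are not part of the answer above.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): turn every "pick" command
# in a rebase todo file into "edit". An attached suffix (-i.bak) is
# accepted by both GNU and BSD sed, unlike a bare -i or a separate ''.
pick_to_edit() {
    sed -i.bak -e 's/^pick /edit /' "$1" && rm -f "$1.bak"
}

# Demonstration on a fake todo list, so no repository is needed.
todo=$(mktemp)
printf 'pick 1111111 first commit\npick 2222222 second commit\n' > "$todo"
pick_to_edit "$todo"
cat "$todo"
rm -f "$todo"

# With git, the same sed call can be handed over directly
# (git appends the todo file name to the command):
#   GIT_SEQUENCE_EDITOR='sed -i.bak -e "s/^pick /edit /"' git rebase -i --root
```

Anchoring the pattern on "pick " followed by a space keeps comment lines in the todo list untouched.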

Is there a simpler way of telling git to do what the script does?

Given that you want to update each gitlink hash, I'd use git filter-branch (rather than git rebase) to do it, with an --index-filter that does the gitlink hash updates. I'm not sure this is any simpler, but it's more direct. The index filter would use git ls-files --stage much as your script does, but would probably do the hash substitution with a generated sed script or an awk script. Generated sed would probably be faster, while awk would be simpler, especially if you have a modern awk where you can just read in the hash mapping.
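The awk variant could look roughly like the sketch below. It assumes a mapping file with one "oldhash,newhash" pair per line; the function name remap_gitlinks, the file layout, and the sample hashes are all assumptions for illustration. It reads git ls-files --stage style lines and prints only the gitlink entries (mode 160000) whose hash has a mapping, in a form that git update-index --index-info accepts.

```shell
#!/bin/sh
# Hypothetical helper (name and map format are assumptions).
# Usage: git ls-files --stage | remap_gitlinks MAPFILE
remap_gitlinks() {
    awk -v mapfile="$1" '
        BEGIN {
            # Load the old -> new hash mapping once.
            while ((getline line < mapfile) > 0) {
                split(line, pair, ",")
                map[pair[1]] = pair[2]
            }
            close(mapfile)
        }
        {
            # Index lines look like: MODE SP HASH SP STAGE TAB PATH
            tab = index($0, "\t")
            if (tab == 0) next
            split(substr($0, 1, tab - 1), f, " ")
            path = substr($0, tab + 1)
            # Emit only gitlinks whose hash appears in the mapping.
            if (f[1] == "160000" && (f[2] in map))
                printf "%s %s %s\t%s\n", f[1], map[f[2]], f[3], path
        }
    '
}

# Demonstration on fake index lines, so no repository is needed.
old=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
new=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
map=$(mktemp)
printf '%s,%s\n' "$old" "$new" > "$map"
printf '100644 %s 0\tREADME\n160000 %s 0\tfoo\n' "$new" "$old" | remap_gitlinks "$map"
rm -f "$map"
```

Inside an --index-filter the output would be piped into git update-index --index-info. Since filter-branch evaluates the filter in its own shell, the awk program would have to be inlined in the filter string or saved as a script that the filter can call; a shell function defined in the calling shell would not be visible there.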

After having to do this a few times over the years, I took torek's advice and wrote my overly verbose bash script as a single git filter-branch . I'm posting it here, both for other users and future me.

First, just to clarify how I did the lfs migrate import (and I'm sure I took the long route for some of these lines):

# Make sure we have the up-to-date remote branches
git submodule update --init SubmodulePath/
cd SubmodulePath/
git fetch --all

# Create local branches that mirror the remote ones
git branch -lr | grep -v "origin/HEAD" | sed 's/^.*origin\///' | 
   xargs -I @ git branch @ origin/@ --force

#Find all files that git identifies as binary and create the lfs migrate command, then run it
git log --all --numstat | grep '^-' | cut -f3 | sed 's|^.*/\(.*\)|\1|' | sed 's|^.*\.\([^.]*\)|\1|' |
   sort -u --ignore-case | sed 's|\([^0-9]\)|[\L\1\U\1]|g' | awk '{print}' ORS=',*.' |
   sed 's|^\(.*\),\*\.$|git lfs migrate import --everything --object-map=LFSImport.txt --include="*.\1"|' | . /dev/stdin

I then moved LFSImport.txt to a different directory (I also committed it to the submodule repo) and ran filter-branch with an --index-filter:

git filter-branch -f --index-filter '
   numLines=`git ls-files --stage | grep SubmodulePath | wc -l`
   if [ $numLines = 1 ];
   then
     echo 
     oldHash="$(git rev-parse --quiet --verify :SubmodulePath)"
     echo oldHash: $oldHash
     newHash="$(grep  $oldHash /path/to/LFSImport.txt | cut -d , -f2)"
     echo newHash: $newHash
     git update-index --add --cacheinfo 160000 $newHash SubmodulePath
   fi
   ' HEAD

I probably should have added a check to make sure $newHash wasn't empty (it was empty for one commit of mine, and I just manually set it to something else that didn't exist). As torek mentioned, this was cleaner and faster, and it worked just as well, if not better.
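A minimal sketch of such a check, assuming the same "old,new" layout that the cut -d , -f2 call above relies on. The function name lookup_new_hash and the short sample hashes are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): print the new hash that the
# object map in $2 records for old hash $1; fail when there is no mapping.
lookup_new_hash() {
    # Anchor on "^old," so the hash only matches in the first column.
    new=$(grep "^$1," "$2" | head -n 1 | cut -d , -f2)
    [ -n "$new" ] || return 1
    printf '%s\n' "$new"
}

# Demonstration with a fake object map, so no repository is needed.
map=$(mktemp)
printf 'aaaa,bbbb\n' > "$map"

if newHash=$(lookup_new_hash aaaa "$map"); then
    echo "newHash: $newHash"
fi

if ! lookup_new_hash cccc "$map" > /dev/null; then
    echo "no mapping for cccc; leaving the gitlink unchanged" >&2
fi
rm -f "$map"
```

In the index filter, the same idea amounts to running git update-index only when $newHash is non-empty. Exiting non-zero instead would abort the whole filter-branch run, which may be preferable if an unmapped hash should never happen.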


 