
Get all versions of a file from a single branch

Given a single branch in a git repository, how do I grab all versions of a file? The final desired output is one new output file per version (although printing all of the versions to STDOUT is fine as well, as long as it's easy to determine the extent of each version).

Here's what I'm currently using:

branch=master
filename=...whatever...

myids=( $(git rev-list --objects $branch -- $filename | grep $filename | perl -e 'for (<>) { print substr($_, 0, 40) . "\n"; }') )

for (( i=0,j=1; i < ${#myids[@]}; i++,j++ )) 
    do
        echo $i $j
        name="output_$j.txt"
        git cat-file -p ${myids[$i]} > $name
    done

Explanation of what I'm doing:

  • use rev-list to find relevant hashes
  • strip the hashes of commits and trees, leaving just the hashes of the files (these lines also include the filename)
  • strip the filenames
  • run the hashes through cat-file and generate output files

The two main problems I have with this are that 1) it's not robust, because the grep could have false positives, and I'm not sure if it would follow file renames, and 2) it feels super hacky.

Is there a better way to do this, perhaps making better use of git's API?

Not the best solution, but it's one possibility

If this is just a one-off thing that you're trying to do, you could try the following script. It uses Git porcelain as of version 1.9.4, though, so it's not a robust, reliable solution: porcelain behavior depends on which version of Git you're using.

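#!/bin/bash
# Write each committed version of one file to temp/<commit-sha>.file
mkdir -p temp
filepath=osx/.gitconfig

# git log prints one full commit hash per line, newest first
for sha in $(git log --format="%H" -- "$filepath"); do
  git show "$sha:$filepath" > "temp/$sha.file"
done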

It simply uses git log to find all commits that modified the file:

$ git log --format="%H" osx/.gitconfig
338243aa6b68edad1dc3b2eebf66e108e9a4d685
7a4667138a519691386940ac23f9c8271ce14c77
475593a612141506f59a141e38b8c6a3a2917f85
03fa0711032cfdfc37fb431d60567ef22d75c7e5
3f7d8f0fc7e1d7a614f2aef8f53947ec2ce61296
c5fef8fccef3fc13f9dea17db209f2ceaab70002
287dadd8bcaf7e9197c6a16d57d3bacb72a41812
1f34ee1ab6965635a8f412bf3387f9dfdf197a1d

It then uses the <revision>:<filepath> syntax to output the version of the file from each of those revisions.

git log can sometimes simplify the history graph, though, so you might want to pass the --full-history flag to git log. I'm not sure whether that is necessary for this particular use case.
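One way to check whether history simplification matters for a given file is to count the matching commits with and without --full-history. The following is only a sketch: the function name compare_history_modes and its output format are made up for illustration, and it uses git rev-list, which accepts the same history-simplification flags as git log.

```shell
#!/bin/bash
# Hypothetical helper: compare how many commits touch a path with and
# without --full-history. Differing counts mean simplification dropped
# some commits (typically on merged side branches).
compare_history_modes() {
  local branch=$1 filepath=$2
  printf 'default: %s\n' \
    "$(git rev-list --count "$branch" -- "$filepath")"
  printf 'full-history: %s\n' \
    "$(git rev-list --count --full-history "$branch" -- "$filepath")"
}
```

For example, compare_history_modes master osx/.gitconfig. If both numbers match, the plain git log call is probably enough for that file.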

How well would this follow renamed files though? You'd probably need to make the script a little smarter about keeping track of that, in order to use the right file path.
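A rough sketch of one way to keep track of the path across renames: ask git log to --follow the file and to print, for each commit, the name the file had in that commit (--name-only), then show the blob under that name. The function name, the "commit " marker and the version_N.txt naming are assumptions made for this example. It also assumes no path begins with "commit " and that paths contain no characters Git would quote.

```shell
#!/bin/bash
# Hypothetical helper: dump every version of a file, following renames.
# version_1.txt is the newest version because git log lists newest first.
dump_versions_following_renames() {
  local filepath=$1 outdir=$2
  local i=0 sha='' line
  mkdir -p "$outdir" || return 1
  # Each commit prints "commit <sha>", a blank line, then the file's
  # path as it was named in that commit.
  git log --follow --format='commit %H' --name-only -- "$filepath" |
  while IFS= read -r line; do
    case $line in
      'commit '*) sha=${line#commit } ;;
      '') ;;
      *)
        # Skip commits where the blob is absent (e.g. the file was deleted).
        if git cat-file -e "$sha:$line" 2>/dev/null; then
          i=$((i + 1))
          git show "$sha:$line" > "$outdir/version_$i.txt"
        fi
        ;;
    esac
  done
}
```

Called as dump_versions_following_renames osx/.gitconfig temp, this writes one file per commit into temp. Note that --follow only works with a single path and relies on Git's rename detection, so heavily rewritten files may not be tracked.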

Again, however, I'd like to emphasize that this is not the best solution. A better solution would make use of Git plumbing commands instead, since they won't be so dependent on the Git version and will be more backward and forward compatible.
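As a sketch of what a plumbing-based variant might look like (not a tested, definitive solution): git rev-list can list the commits on a branch that touched a path, and git cat-file can print the blob named by <commit>:<path>. The function name dump_versions, its argument order and the output_N.txt naming are illustrative choices, and renames are not followed.

```shell
#!/bin/bash
# Hypothetical helper: dump every committed version of one file using
# only plumbing commands. output_1.txt is the oldest version.
dump_versions() {
  local branch=$1 filepath=$2 outdir=$3
  local i=0 sha
  mkdir -p "$outdir" || return 1
  # rev-list prints one commit hash per line; hashes contain no
  # whitespace, so word splitting in the for loop is safe here.
  for sha in $(git rev-list --reverse "$branch" -- "$filepath"); do
    # A commit that deleted the file has no blob at that path: skip it.
    if git cat-file -e "$sha:$filepath" 2>/dev/null; then
      i=$((i + 1))
      git cat-file blob "$sha:$filepath" > "$outdir/output_$i.txt"
    fi
  done
}
```

For example, dump_versions master osx/.gitconfig temp. Restricting rev-list to a path avoids grepping object listings for the filename, which removes the false-positive risk mentioned in the question.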


I would suggest using git-log on the file itself, instead of git-rev-list:

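#!/bin/bash

filename=...whatever...

i=1
# List the commits that touched the file, newest first
for hash in $(git log --pretty=format:%H -- "$filename")
do
    # <commit>:<path> names the file's content as of that commit
    git show "$hash:$filename" > "output_$i.txt"
    ((i++))
done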
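Since the question says that printing everything to STDOUT is also acceptable as long as each version's extent is clear, the same loop could print a header line before each version instead of writing files. This is a sketch only: the function name print_versions and the "=====" delimiter are arbitrary choices, and a file whose content contains such a line would make the boundaries ambiguous.

```shell
#!/bin/bash
# Hypothetical helper: print every version of a file to stdout, each
# preceded by a header line giving the commit and path. Newest first.
print_versions() {
  local filename=$1 hash
  for hash in $(git log --pretty=format:%H -- "$filename"); do
    printf '===== %s:%s =====\n' "$hash" "$filename"
    # Fails with an error on stderr for commits that deleted the file.
    git show "$hash:$filename"
  done
}
```

For example, print_versions osx/.gitconfig > all_versions.txt collects every version in one file, with the commit hash in each header.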
