
How to see the file size history of a single file in a git repository?

Is there any way to see how a file's size has changed over time in a git repository? I want to see how my main.js file (which is several files combined and minified) has grown and shrunk over time.

You can use git ls-tree -r -l <revision> <path> to get the blob size at a given revision, e.g.

$ git ls-tree -r -l v1.6.0 gitweb/README
100644 blob 825162a0b6dce8c354de67a30abfbad94d29fdde   16067    gitweb/README

The blob size in this example is '16067'. The disadvantage of this solution is that git ls-tree can process only one revision at a time.

Alternatively, you can use git cat-file --batch-check < <list-of-objects>, feeding it blob identifiers. If the location of the file didn't change through history (the file was not moved), you can use git rev-list <starting-point> -- <path> to get the list of revisions touching the given path, translate them into blob names using the <revision>:<path> extended SHA-1 syntax (see the git-rev-parse manpage), and feed the result to git cat-file. Example:

$ git rev-list -5 v1.6.0 -- gitweb/README | 
  sed -e 's/$/:gitweb\/README/g' |
  git cat-file --batch-check
825162a0b6dce8c354de67a30abfbad94d29fdde blob 16067
6908036402ffe56c8b0cdcebdfb3dfacf84fb6f1 blob 16011
356ab7b327eb0df99c0773d68375e155dbcea0be blob 14248
8f7ea367bae72ea3ce25b10b968554f9b842fffe blob 13853
8dfe335f73c223fa0da8cd21db6227283adb95ba blob 13801
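The output above lists blob IDs, not the commits they came from. A minimal sketch of the same pipeline that pairs each commit with its blob size, assuming a Git version that supports a custom --batch-check format with the %(rest) atom; the function name git_blob_sizes is hypothetical, and paths containing whitespace are not handled:

```shell
# Print "<commit> <size-in-bytes>" for every commit that touched the given path.
# git_blob_sizes is a hypothetical helper name, not a git command.
git_blob_sizes() {
  git rev-list HEAD -- "$1" |
    # Emit "<commit>:<path> <commit>"; cat-file treats everything after the
    # first whitespace as %(rest) and echoes it back unchanged.
    awk -v p="$1" '{ print $1 ":" p " " $1 }' |
    git cat-file --batch-check='%(rest) %(objectsize)'
}
```

Call it as git_blob_sizes gitweb/README from inside the repository. A commit that deleted the file has no blob at that path, so cat-file reports that entry as missing instead of printing a size.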

Create a file called .gitattributes and add the following line:

main.js -diff

This turns off line-based diffs for main.js. Now run the following command:

git log --stat main.js

The log will include lines like

main.js | Bin 4316 -> 4360 bytes

After you're done, you should probably delete .gitattributes. I don't know what other changes in git's behavior the -diff attribute may cause.

Tested with git versions 1.7.12.4 and 1.7.9.5.

Source: ewall's answer and https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html#_marking_files_as_binary

You could create a script that uses the output from git show --pretty=raw <commit> to obtain the tree, then uses git ls-tree -r -l to obtain the blob you are looking for, including the file size.

In case you have ruby and the grit gem installed, here's a little script I threw together:

require 'grit'

if ARGV.size < 1
  puts 'usage: file-size FILE'
  puts 'run from within the git repo root'
  exit
end

filename = ARGV[0].to_s

repo = Grit::Repo.new('.')
commits = repo.log('master', filename)
commits.each do |commit|
  blob = commit.tree/filename
  puts "#{commit} #{blob.size} bytes"
end

Example usage (the script is saved as file-size.rb); this shows the history for somedir/somefile:

myproject$ ruby file-size.rb somedir/somefile

Here is a Bash function that reports the size over time in the following format:

 LoC  Date                       Commit ID   Subject
 942  2019-08-31 18:09:34 +0200  35fc67c122  Declare some XML namespaces in replacement of OGCPrefixMapper, which has been removed from Apache SIS. https://issues.apache.org/jira/browse/SIS-126
 943  2019-08-09 16:52:29 +0200  e8438ab869  fix(GML): fix relative path resolving inside a jar
 934  2019-08-05 15:37:46 +0200  1e0c0b03c4  fix(GML): fix all test cases
 932  2019-07-30 15:54:53 +0200  fddea5db24  feat(GML): work on fallback for non-xsd Feature store
 932  2019-07-23 16:40:23 +0200  8d9a6a7dd0  feat(GML): improve support for custom XML mappings
 932  2019-06-26 15:18:43 +0200  43ea6e0bd7  feat(GML): add concurrency support for read/write operations
 932  2019-06-21 09:27:41 +0200  07a9993b4b  feat(GML): support group reference min/max occurs attributes
 932  2019-06-21 09:27:41 +0200  352a9104ae  feat(GML): fix resolving local files xsd paths
 919  2018-06-08 15:41:26 +0200  01ac7538e7  Merge branch 'master' into sis-migration
 919  2018-05-16 16:40:04 +0200  16fe7590c5  fix(JAXP): various fix for  WFS 2.0.0
 912  2018-04-11 10:09:22 +0200  bf3a38bdc4  chore(*): update JTS version 1.15.0
 912  2017-11-09 20:15:23 +0100  bc14dc4be1  fix(Client): fix minor problems on WFS querying
 901  2017-10-20 11:41:43 +0200  f686d7ff15  feat(Storage): add support for GML 2.1.2
 882  2017-05-16 23:07:31 +0200  f20c34c1e2  refactor(Feature): renamed the Geotk flavor of org.apache.sis.feature package as org.geotoolkit.feature.

Here is the function:

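git-log-size() {
    git rev-list HEAD -- "$1" | while read -r cid; do
        # Count the lines of the file as it existed at this commit
        git cat-file blob "$cid:$1" | wc -l | tr -d ' \n'
        printf '\t'
        # Append commit date, abbreviated hash, and subject, tab-separated
        git log -1 "--pretty=%ci%x09%h%x09%s" "$cid"
    done | column -t -s$'\t'
}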

It is not particularly efficient, but it does the job. It relies on a few common utilities (wc, tr, column).

The size is reported in lines of code (LoC), a common metric in software development. Change the -l option of wc (for example to -c for bytes) if you prefer something else.

Here is how to call it:

git-log-size <path>

In case this is of use to someone, this script shows the size in bytes of a given file at each commit that touched it:

git log <file_name> | grep "^commit" | cut -f2 -d" " | while read -r hash; do
   echo -n "$hash -- "
   git show "$hash":<file_path_off_of_git_root_without_leading_slash> | wc -c
done

While commands like git log <filename>, git whatchanged, etc. can show the history of that file, I don't see an option in either the built-in or custom pretty formats that shows its size (sadly, the --log-size option only applies to the log messages!).

However, you can get a rough idea of the size from the number of lines added and removed in each commit. You can roughly visualize it with git log --stat <filename>, which draws the changes with plus and minus signs. Or use git log --numstat <filename> to collect the number of lines added and removed in each commit and feed the numbers into some other visualization.
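As a rough illustration of the --numstat idea, the sketch below keeps a running total of added minus removed lines, oldest commit first. The function name git_line_totals is hypothetical. It assumes a mostly linear history for a text file that was never renamed; binary changes (which numstat reports as "-") are skipped, so treat the totals as an approximation.

```shell
# Print "<abbrev-commit><TAB><running line count>" for each commit touching the path.
# git_line_totals is a hypothetical helper name, not a git command.
git_line_totals() {
  git log --reverse --format='%h' --numstat -- "$1" |
    awk '
      # A line with a single field is the abbreviated commit hash.
      NF == 1 { commit = $1; next }
      # A numstat line is "<added> <removed> <path>"; skip binary entries ("-").
      NF >= 3 && $1 ~ /^[0-9]+$/ {
        total += $1 - $2
        print commit "\t" total
      }'
}
```

Running git_line_totals main.js prints one line per commit, which can be piped into a plotting tool or a spreadsheet.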

On Windows I am using the following command:

cmd /c "@echo off & for /l %N in (1 1 30) do git ls-tree -r -l HEAD~%N "C:\path\to\file.txt"

It shows the size of the file at each of the 30 commits preceding HEAD (HEAD~1 through HEAD~30).

If somebody can convert that to a Linux command, you are welcome to.
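One possible POSIX shell equivalent of the Windows loop, offered as a sketch rather than a tested port: the function name git_last_sizes and its optional count argument are assumptions, and the path is given relative to the current directory instead of as an absolute Windows path.

```shell
# Show the ls-tree entry (including size) of a file at HEAD~1 .. HEAD~N.
# git_last_sizes is a hypothetical helper name; N defaults to 30.
git_last_sizes() {
  count=${2:-30}
  n=1
  while [ "$n" -le "$count" ]; do
    # Stop quietly once HEAD~n no longer names an existing commit.
    git ls-tree -r -l "HEAD~$n" "$1" 2>/dev/null || break
    n=$((n + 1))
  done
}
```

For example, git_last_sizes path/to/file.txt 30. As with the original command, this walks consecutive commits, so a commit that did not change the file repeats the previous size.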

Simply do:
git log --stat /path/to/file


Bash function that lists out the size of a file by revision:

function git-filehist() {
  # Quote the path and revision so names with spaces do not get split
  for rev in $(git rev-list HEAD -- "$1"); do
    git ls-tree -r -l "$rev" "$1"
  done
}
