
How can I use Git to identify function changes across different revisions of a repository?

I have a repository with a bunch of C files. Given the SHA hashes of two commits, <commit-sha-1> and <commit-sha-2>, I'd like to write a script (probably bash/ruby/python) that detects which functions in the C files in the repository have changed across these two commits.

I'm currently looking at the documentation for git log, git commit and git diff. If anyone has done something similar before, could you give me some pointers about where to start or how to proceed?

It isn't pretty, but you could combine Git with your favorite tagging system, such as GNU global, to achieve that. For example:

#!/usr/bin/env sh

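# global -f prints one tag per line in cross-reference format
# (name, line number, file, source line), so the first field is the name.
global -f main.c | awk '{print $1}' | while read -r i
do
    # git log -L prints nothing if no commit in the range touched the function
    if [ -n "$(git log -L ":$i:main.c" HEAD^..HEAD)" ]
    then
        printf "%s() changed\n" "$i"
    else
        printf "%s() did not change\n" "$i"
    fi
done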

First, you need to create a database of functions in your project:

$ gtags .

Then run the above script to find the functions in main.c that were modified by the last commit. The script could of course be made more flexible; for example, it could handle all *.c files changed between two commits, as reported by git diff --stat (or git diff --name-only, which is easier to parse).

Inside the script we use the -L option of git log:

  -L <start>,<end>:<file>
  -L :<funcname>:<file>
      Trace the evolution of the line range given by "<start>,<end>" (or the function name regex <funcname>) within the <file>. You may not give any pathspec limiters. This is currently limited to a walk starting from a single revision, i.e., you may only give zero or one positive revision arguments. You can specify this option more than once.
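The more flexible variant could be sketched roughly as follows. This is only an outline under some assumptions: the function names function_changed and report_changed_functions are made up for illustration, the script is run from the repository root, the older commit is an ancestor of the newer one, the newer commit is checked out, and gtags has been run on that checkout so that global sees the same functions that git log -L will look for.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if any commit in old..new touched <func> in <file>.
function_changed() {
    local old=$1 new=$2 func=$3 file=$4
    # git log -L prints nothing when no commit in the range touched the function.
    # Errors (e.g. the function is not found in <new>) count as "not changed".
    [ -n "$(git log -L ":${func}:${file}" "${old}..${new}" 2>/dev/null)" ]
}

# Hypothetical driver: check every function of every modified C file.
report_changed_functions() {
    local old=$1 new=$2 file func
    # --diff-filter=M keeps only files that exist in both commits
    git diff --name-only --diff-filter=M "$old" "$new" -- '*.c' |
    while IFS= read -r file; do
        # Assumes "global -f" lists one tag per line with the name in field 1
        global -f "$file" | awk '{print $1}' |
        while IFS= read -r func; do
            if function_changed "$old" "$new" "$func" "$file"; then
                printf '%s: %s() changed\n' "$file" "$func"
            fi
        done
    done
}
```

After sourcing the file, it would be called as report_changed_functions <commit-sha-1> <commit-sha-2>. Files that were added or deleted between the two commits are skipped on purpose, since every function in them is trivially new or gone.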

See this question.

Bash script:

#!/usr/bin/env bash

# Any arguments (for example two commit hashes) are passed through to git diff.
git diff "$@" | \
grep -E '^(@@)' | \
grep '(' | \
sed 's/@@.*@@//' | \
sed 's/(.*//' | \
sed 's/\*//g' | \
awk '{print $NF}' | \
sort -u

Explanation:

1: Get diff

2: Get only lines with hunk headers; if the 'optional section heading' of a hunk header exists, it will be the function definition of a modified function

3: Pick only hunk headers containing open parentheses, as they will contain function definitions

4: Get rid of '@@ [old-file-range] [new-file-range] @@' sections in the lines

5: Get rid of everything after opening parentheses

6: Get rid of '*' from pointers

7: Using awk, print the last field (i.e., column) of each record (i.e., line), which is the function name.

8: Get rid of duplicate names.
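To apply the same idea to the two commits from the question and keep track of which file each function belongs to, the steps above could be folded into a single awk pass. This is a sketch, not a robust parser: funcs_by_file_from_diff is a made-up name, file names containing spaces are not handled, and it relies on the hunk headers carrying a function heading at all.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read a unified diff on stdin and print
# "<file>: <function>" once for every function named in a hunk header.
funcs_by_file_from_diff() {
    awk '
        # "+++ b/path" names the new version of the file being diffed
        /^[+][+][+] / { file = $2; sub(/^b\//, "", file); next }
        # Hunk headers that carry a function heading contain a "("
        /^@@/ && /[(]/ {
            line = $0
            sub(/^@@[^@]*@@/, "", line)   # drop the "@@ -a,b +c,d @@" ranges
            sub(/[(].*/, "", line)        # drop the parameter list
            gsub(/[*]/, " ", line)        # drop pointer stars
            n = split(line, words, " ")
            if (n > 0) print file ": " words[n]   # last word is the name
        }
    ' | sort -u
}
```

Usage would be along the lines of: git diff <commit-sha-1> <commit-sha-2> -- '*.c' | funcs_by_file_from_diff. Keep in mind that the heading in a hunk header is the nearest function-like line before the start of the hunk, so a change in the first few lines of a function can be attributed to the function above it. Adding "*.c diff=cpp" to .gitattributes makes Git use its C-aware heading pattern instead of the generic one.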
