diff in pre-receive hook
I have written a simple server-side git pre-receive hook in Python. The goal is to analyze diffs and reject pushes that contain certain text that we consider invalid.
I wrote the hook using the following set of commands:
git ls-tree
git diff --name-only
git cat-file
However, I just noticed that I am scanning the entire contents of the files pushed as part of the commit, whereas I only want to scan the diff, i.e. the lines changed in this push.
The reason is that some "invalid" text can be a false positive and is okay; it can be force-pushed. However, if the same file is edited again and valid text is added, the push will be rejected just because that file previously contained invalid text. This will happen each time the file is edited, which is kind of annoying.
So basically the question is: how do I get just the changed lines (the diff) of the current push in server-side hook code, instead of scanning complete files?
Thanks
... how to get just the changed lines
This question is incomplete. Suppose I tell you that there are some people, including Alice, Bob, Carol, and so on. Now I tell you that Bob is different. Different from who or what?
In a pre-receive hook, you must read lines from your standard input. Each line has the form:
old-hash new-hash reference-name
What do these mean? (That's an exercise for you to answer before you go on to the next sections, though the answer is embedded in the last section below.)
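As a minimal sketch of the reading step, the following splits each input line into its three fields. The names RefUpdate and read_updates are made up for this illustration, and the hashes in the sample are placeholders, not real object IDs.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class RefUpdate(NamedTuple):
    old: str  # first field on the line
    new: str  # second field on the line
    ref: str  # third field, e.g. "refs/heads/main"


def read_updates(stream: TextIO) -> Iterator[RefUpdate]:
    """Yield one RefUpdate per non-blank line of pre-receive input."""
    for line in stream:
        fields = line.split()
        if not fields:
            continue
        # Reference names cannot contain spaces, so a well-formed line
        # always splits into exactly three fields.
        old, new, ref = fields
        yield RefUpdate(old, new, ref)


# Example with placeholder hashes; a real hook would pass sys.stdin instead.
sample = io.StringIO("1" * 40 + " " + "2" * 40 + " refs/heads/main\n")
updates = list(read_updates(sample))
print(updates[0].ref)  # refs/heads/main
```

Keeping the parsing separate from the policy makes it easy to feed the function canned input while you are still developing the hook.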
A commit is a snapshot of files: complete copies of every file that was frozen into that commit. There are no differences involved; there are just complete files.
You, however, want differences. To get a difference for some file file.ext, you must pick some other version of file.ext and compare the two. What is the correct "other version"?
For some commits, you are in luck: there's a very clear correct "other version" of file.ext, which is the copy of file.ext in that commit's parent commit. In fact, this repeats for every file in the commit: we would like to compare that commit's version of each file to the parent's version of that file, to see what changed.
There's a handy script-able ("plumbing") command for this, which is git diff-tree: given the hash ID of an ordinary non-merge commit, git diff-tree compares the commit's parent to the commit. Add -p or --patch to get a textual difference (this automatically implies the -r option). Consider using -U0 to drop context lines. You will, of course, still need to parse the output lines, to detect hunk headers and the added/deleted markers.
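The parsing step might look roughly like the sketch below, which pulls only the added lines out of unified diff text such as git diff-tree -p -U0 prints. The function name added_lines and the sample patch are invented for illustration. The sketch assumes plain file names: git quotes paths containing unusual characters, and that case is not handled here.

```python
import re
from typing import Iterator, Optional, Tuple

# Hunk header, e.g. "@@ -10,0 +12,3 @@"; group 1 is the first new-side line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(patch: str) -> Iterator[Tuple[Optional[str], int, str]]:
    """Yield (path, line_number, text) for each added line in a patch."""
    path, lineno, in_hunk = None, 0, False
    for line in patch.split("\n"):
        hunk = HUNK_RE.match(line)
        if line.startswith("diff --git "):
            path, in_hunk = None, False       # a new file section begins
        elif hunk:
            lineno, in_hunk = int(hunk.group(1)), True
        elif not in_hunk:
            # File headers only count before the first hunk; inside a hunk
            # a line starting with "+++ " is just an added line.
            if line.startswith("+++ "):
                target = line[4:]
                path = target[2:] if target.startswith("b/") else None
        elif line.startswith("+"):
            yield path, lineno, line[1:]
            lineno += 1
        elif line.startswith(" "):
            lineno += 1                       # context line (absent with -U0)


sample_patch = """diff --git a/notes.txt b/notes.txt
index 1111111..2222222 100644
--- a/notes.txt
+++ b/notes.txt
@@ -2 +2,2 @@
-old line
+new line
+++ tricky line
@@ -10,0 +12 @@ some context
+last
"""
for found in added_lines(sample_patch):
    print(found)
```

Deleted lines (those starting with "-") are skipped on purpose, since removing invalid text should presumably not cause a rejection.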
A simple git diff-tree <hash> does not, however, work for two cases of commits:
A root commit has no parent. Fortunately, the empty tree comes to the rescue:
git diff-tree -p $(git hash-object -t tree /dev/null) $hash
does the trick.
A merge commit has two or more parents. Here git diff-tree produces a combined diff by default. If that's OK, you can ignore this case. If not, you might consider using --first-parent -m or just -m to split the merge and get a separate diff against each parent (the default with -m) or against only the first parent (with --first-parent).
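One way to sidestep both special cases in a script is to always pass two tree-ish arguments to git diff-tree, choosing the base explicitly. The sketch below only builds the argument lists; diff_tree_args is a hypothetical helper name. It also assumes a SHA-1 repository, where the empty tree ID can be computed directly instead of calling git hash-object.

```python
import hashlib
from typing import List, Sequence

# A tree object with no entries consists of the header "tree 0\0" and no
# content, so it has the same ID in every SHA-1 repository.
EMPTY_TREE = hashlib.sha1(b"tree 0\0").hexdigest()


def diff_tree_args(commit: str, parents: Sequence[str]) -> List[List[str]]:
    """Return one git diff-tree argument list per diff wanted for commit."""
    base = ["git", "diff-tree", "-p", "-U0"]
    if not parents:
        # Root commit: compare against the empty tree.
        return [base + [EMPTY_TREE, commit]]
    # Ordinary commit: a single diff. Merge commit: one diff per parent,
    # similar in spirit to -m; pass parents[:1] for a first-parent view.
    return [base + [parent, commit] for parent in parents]


print(EMPTY_TREE)  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904
```

Whether a merge should be compared to every parent or only to the first is a policy decision; diffing against every parent can report the same added line more than once.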
That gets you the diff for one commit, so now we move on to the last part.
As you read each line, it's your job to:
Check the old and new hashes for the special all-zero-digits null hash. In Python, there are multiple ways to express this; one is:
def is_null(hash_id):  # parameter renamed so it does not shadow the builtin hash()
    return all(c == '0' for c in hash_id)
If the old hash is null, the reference is being created at the new hash. If the new hash is null, the reference used to have the given old hash, and is being deleted. Otherwise (neither hash is null) the reference is being updated: it had the old hash, and will have the new hash.
Figure out what to do, if anything, with the change to the particular reference. Is deletion allowed? Is creation allowed? Does it matter if this is a branch name (starts with refs/heads/) vs a tag name (starts with refs/tags/) vs something else entirely?
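The two checks above could be sketched as a pair of small functions. The names classify and ref_kind are invented for this example, and what you do with their results (allow, reject, scan) is entirely up to your own policy.

```python
def classify(old: str, new: str) -> str:
    """Return "create", "delete", or "update" for one input line."""
    old_null = set(old) == {"0"}
    new_null = set(new) == {"0"}
    if old_null and new_null:
        raise ValueError("both hashes are null")
    if old_null:
        return "create"
    if new_null:
        return "delete"
    return "update"


def ref_kind(ref: str) -> str:
    """Return "branch", "tag", or "other" based on the reference prefix."""
    if ref.startswith("refs/heads/"):
        return "branch"
    if ref.startswith("refs/tags/"):
        return "tag"
    return "other"


# Placeholder hashes, for illustration only.
print(classify("0" * 40, "a" * 40), ref_kind("refs/heads/main"))  # create branch
```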
Creations are especially difficult. The newly introduced name makes the given object reachable by that name. If the object is a tag or commit, that makes additional objects reachable by that name as well. Some or all of these objects may be new. Some or all of these objects may already exist. The classic case is when someone creates a new branch name: it may point to an existing commit, already on some other branch, or it may point to a new commit, the new tip of the new branch, which may have many additional new commits before joining up with some existing branch(es).
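One common way to handle creation, which is an addition here and not something the text above prescribes, is to list the commits reachable from the new hash but from none of the existing references, using git rev-list with --not --all. This relies on the references not having been updated yet while pre-receive runs. The helper below only builds the command; rev_list_args is a hypothetical name.

```python
from typing import List, Optional


def rev_list_args(old: str, new: str) -> Optional[List[str]]:
    """Build a git rev-list command for the commits worth examining."""
    if set(new) == {"0"}:
        return None  # deletion: nothing becomes newly reachable
    if set(old) == {"0"}:
        # Creation: everything reachable from the new hash, minus whatever
        # the repository's existing references already reach.
        return ["git", "rev-list", new, "--not", "--all"]
    # Update: commits reachable from new but not from old.
    return ["git", "rev-list", old + ".." + new]


print(rev_list_args("0" * 40, "a" * 40))
```

If the new branch name points to a commit that already exists on another branch, this command prints nothing, which matches the intuition that nothing new needs scanning.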
Updates are the most common, and usually the simplest to handle. You know that the existing reference name made the old object reachable, and the proposed update is to make the new object reachable. If the reference is a branch name, both objects are in fact commit objects, and it is easy to find which commits, if any, are newly reachable from the proposed new hash, and which commits, if any, are being removed from reachability via the proposed new hash:
git rev-list $old..$new
produces the set of hash IDs that are newly reachable, and:
git rev-list $new..$old
produces the set that are no longer reachable. (Use git rev-list --left-right $old...$new, with three dots, to get both sets of hash IDs at once, with distinguishing markers. You can also use $new...$old: the symmetric difference that this produces is itself symmetric, except of course that the left and right sides are reversed.)
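With --left-right, each output line starts with "<" for a commit reachable only from the left side and ">" for one reachable only from the right side. A small sketch of splitting such output for $old...$new follows; split_left_right is a made-up name and the hashes are placeholders.

```python
from typing import List, Tuple


def split_left_right(output: str) -> Tuple[List[str], List[str]]:
    """Split rev-list --left-right output for old...new into two lists."""
    no_longer_reachable, newly_reachable = [], []
    for line in output.split():
        if line.startswith("<"):    # left side: reachable only from old
            no_longer_reachable.append(line[1:])
        elif line.startswith(">"):  # right side: reachable only from new
            newly_reachable.append(line[1:])
    return no_longer_reachable, newly_reachable


sample_output = "<" + "a" * 40 + "\n>" + "b" * 40 + "\n>" + "c" * 40 + "\n"
gone, added = split_left_right(sample_output)
print(len(gone), len(added))  # 1 2
```

A non-empty first list means the push removes commits from the branch, which is what a forced, non-fast-forward update looks like.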
Assuming you have handled creation somehow, if your goal is to examine newly-reachable commits (whether or not they are new to the repository overall), you can simply walk through all the new commits, testing each one to see if it is a root commit, an ordinary (single-parent) commit, or a merge commit. (Hint: add --parents to the git rev-list command to get the parent IDs included, so that you can easily tell how many parents each commit has. Also, consider the graph structure of the commit graph fragment you are walking: $old..$new may include merges, which may make many commits reachable that may or may not be new to the repository.)
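To illustrate the hint, here is a sketch that parses git rev-list --parents output, where each line is a commit ID followed by its parent IDs, and labels every commit by parent count. The function names and the shortened IDs in the sample are invented for this example.

```python
from typing import Dict, List


def parse_parents(output: str) -> Dict[str, List[str]]:
    """Map each commit ID to its list of parent IDs."""
    result = {}
    for line in output.splitlines():
        fields = line.split()
        if fields:
            result[fields[0]] = fields[1:]
    return result


def commit_kind(parents: List[str]) -> str:
    """Label a commit as "root", "ordinary", or "merge" by parent count."""
    if not parents:
        return "root"
    return "ordinary" if len(parents) == 1 else "merge"


# Shortened fake IDs: c3 merges c1 and c2, c2 has parent c0, c0 is a root.
sample_output = "c3 c1 c2\nc2 c0\nc0\n"
for commit, parents in parse_parents(sample_output).items():
    print(commit, commit_kind(parents))
```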
You now have all the commit hashes, and their parent counts. You also know how to use git diff-tree to compare each commit against its parent(s) or against the empty tree as needed. So now you are ready to write your fancy pre-receive hook.
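Putting the pieces together, a hook along these lines might look like the sketch below. It is only one possible arrangement, with several assumptions: the FORBIDDEN pattern is a placeholder for your own rule, only branch updates are scanned, merges are compared to their first parent only, new branches are limited with --not --all, the repository uses SHA-1, and file names are plain (unquoted) in the diff output.

```python
import re
import subprocess
import sys

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # SHA-1 empty tree
FORBIDDEN = re.compile(r"DO NOT PUSH")  # hypothetical "invalid text" rule
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def git(*args):
    """Run a git command and return its standard output as text."""
    done = subprocess.run(("git",) + args, check=True, stdout=subprocess.PIPE)
    return done.stdout.decode("utf-8", errors="replace")


def is_null(hash_id):
    return set(hash_id) == {"0"}


def commits_with_parents(old, new):
    """Yield (commit, parents) for commits newly reachable via this update."""
    if is_null(old):
        # Assumption: for a new branch, skip commits existing refs reach.
        args = ["rev-list", "--parents", new, "--not", "--all"]
    else:
        args = ["rev-list", "--parents", old + ".." + new]
    for line in git(*args).splitlines():
        commit, *parents = line.split()
        yield commit, parents


def violations(commit, parents):
    """Yield (path, line_number, text) for added lines matching FORBIDDEN."""
    base = parents[0] if parents else EMPTY_TREE  # first parent for merges
    path, lineno, in_hunk = None, 0, False
    for line in git("diff-tree", "-p", "-U0", base, commit).split("\n"):
        hunk = HUNK_RE.match(line)
        if line.startswith("diff --git "):
            path, in_hunk = None, False
        elif hunk:
            lineno, in_hunk = int(hunk.group(1)), True
        elif not in_hunk:
            if line.startswith("+++ b/"):
                path = line[6:]
        elif line.startswith("+"):
            if FORBIDDEN.search(line[1:]):
                yield path, lineno, line[1:]
            lineno += 1
        elif line.startswith(" "):
            lineno += 1


def main(stream):
    """Return 1 (reject the push) if any added line matches, else 0."""
    failed = False
    for line in stream:
        fields = line.split()
        if len(fields) != 3:
            continue
        old, new, ref = fields
        if is_null(new) or not ref.startswith("refs/heads/"):
            continue  # deletions and non-branch refs are not scanned here
        for commit, parents in commits_with_parents(old, new):
            for path, lineno, text in violations(commit, parents):
                print("%s: %s:%d: %s" % (commit[:10], path, lineno, text),
                      file=sys.stderr)
                failed = True
    return 1 if failed else 0


# In the installed hook script, finish with: sys.exit(main(sys.stdin))
```

Because only added lines are inspected, a file that already contains the offending text no longer causes a rejection when someone later edits a different part of it.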