How to find all unmerged commits in master grouped by the branches they were created in?
I have to create code reviews from unmerged branches.
In looking for solutions, let's set aside local-branch issues, as this will run on a server: there will be just the origin remote, I will always run git fetch origin before any other command, and when we talk about branches, we mean origin/branch-name.
If the setup were simple and each branch that originated from master continued on its own way, we could just run:
git rev-list origin/branch-name --not origin/master --no-merges
for each unmerged branch and add the resulting commits to that branch's review.
The problem arises when there are merges between 2-3 branches and work continues on some of them. As I said, I want to create the code reviews programmatically, one per branch, and I don't want to include a commit in multiple reviews.
The problem mainly reduces to finding the original branch for each commit.
Or, to put it more simply: finding all unmerged commits, grouped by the branch they were most probably created on.
Let's focus on a simple example:
      *    b4 - branch2's head
   *  |    a4 - branch1's head
   |  *    b3
   *  |    merge branch2 into branch1
*  |\ |    m3 - master's head
|  * \|    a3
|  |  |
|  |  *    b2
|  *  |    merge master into branch1
* /|  |    m2
|/ |  *    merge branch1 into branch2
|  * /|    a2
|  |/ |
|  |  *    b1
|  | /
|  |/
| /|
|/ |
|  *       a1
* /        m1
|/
|
*          start
and what I want to obtain is the unmerged, non-merge commits grouped by the branch they were most probably created on, with no commit appearing under more than one branch.
The best solution I found so far is to run:
git show-branch --topo-order --topics origin/master origin/branch1 origin/branch2
and parse the result:
* [master] m3
 ! [branch1] a4
  ! [branch2] b4
---
  + [branch2] b4
  + [branch2^] b3
 +  [branch1] a4
 ++ [branch2~2] b2
 -- [branch2~3] Merge branch 'branch1' into branch2
 ++ [branch2~4] b1
 +  [branch1~2] a3
 +  [branch1~4] a2
 ++ [branch1~5] a1
*++ [branch2~5] m1
Output interpretation is like this:
For point 3, the commit name resolution starts with a branch name and, from what I see, this branch corresponds to the branch the commits were created on, probably because the name is resolved along the path reached through first parents.
As I'm not interested in merge commits, I'll ignore them.
I'll then parse each branch-path-commit to obtain their hash with rev-parse.
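The parsing step described above could be sketched roughly as follows. This is only an illustration, assuming the column layout of git show-branch output (one marker column per branch, then the bracketed name); the function name group_show_branch and the embedded SAMPLE text are made up for the example and are not part of git.

```python
import re
from collections import defaultdict

# Sample `git show-branch --topics` output, copied from the example above.
SAMPLE = """\
* [master] m3
 ! [branch1] a4
  ! [branch2] b4
---
  + [branch2] b4
  + [branch2^] b3
 +  [branch1] a4
 ++ [branch2~2] b2
 -- [branch2~3] Merge branch 'branch1' into branch2
 ++ [branch2~4] b1
 +  [branch1~2] a3
 +  [branch1~4] a2
 ++ [branch1~5] a1
*++ [branch2~5] m1
"""

# "[name]" or "[name~N]" / "[name^]" followed by the commit subject.
ENTRY = re.compile(r"\[([^\]~^]+)([~^][^\]]*)?\]\s*(.*)")

def group_show_branch(output):
    """Group non-merge topic commits by the branch their name starts with."""
    lines = output.splitlines()
    # The separator is a line made only of dashes; one header line precedes it per branch.
    sep = next(i for i, line in enumerate(lines) if line and set(line) == {"-"})
    n_branches = sep
    groups = defaultdict(list)
    for line in lines[sep + 1:]:
        markers, rest = line[:n_branches], line[n_branches:].strip()
        if "-" in markers:
            continue  # merge commits are marked with '-'
        if markers[0] != " ":
            continue  # already reachable from the first branch (master here)
        match = ENTRY.match(rest)
        if match is None:
            continue
        branch, rel, subject = match.group(1), match.group(2) or "", match.group(3)
        groups[branch].append((branch + rel, subject))
    return dict(groups)

if __name__ == "__main__":
    for branch, commits in group_show_branch(SAMPLE).items():
        print(branch, [ref for ref, _ in commits])
```

Each collected name (for example branch2~2) could then be passed to git rev-parse to obtain the hash, as described above.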
How can I handle this situation? 我该如何处理这种情况?
The repository could be cloned with --mirror, which creates a bare repository that can be used as a mirror of the original repository and can be updated with git remote update --prune, after which all the tags should be deleted for this feature.
I implemented it this way:
1. Get a list of branches not merged into master:
git branch --no-merged master
2. For each branch, get a list of the revisions that are on that branch and not on master:
git rev-list branch1 --not master --no-merges
If the list is empty, remove the branch from the list of branches.
3. For each revision, determine the original branch with
git name-rev --name-only revisionHash1
and match the result against the regex ^([^\\~\\^]*)([\\~\\^].*)?$. The first group is the branch name, the second is the relative path to the branch. If the branch name found is not equal to the initial branch, remove the revision from the list.
At the end I had a list of branches and, for each of them, a list of commits.
After some more bash research, I found it can all be done in one line:
git rev-list --all --not master --no-merges | xargs -L1 git name-rev | grep -oE '[0-9a-f]{40}\s[^\~\^]*'
The result is output in the form
hash branch
which can be read, parsed, ordered, grouped or whatever.
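As an illustration of the grouping step, here is a minimal sketch that takes text in the "hash name" form printed by git name-rev and groups the hashes by branch. The function name group_by_branch and the sample hashes are made up for the example; the sketch also assumes that name-rev prints "undefined" for commits it cannot name, and skips those.

```python
import re
from collections import defaultdict

# "<40-hex hash> <ref name><optional ~N / ^N suffix>", one commit per line.
NAME_REV_LINE = re.compile(r"^([0-9a-f]{40})\s+([^~^\s]+)([~^]\S*)?$")

def group_by_branch(name_rev_output):
    """Map branch name -> list of commit hashes named relative to that branch."""
    groups = defaultdict(list)
    for line in name_rev_output.splitlines():
        match = NAME_REV_LINE.match(line.strip())
        if match is None:
            continue  # blank or unexpected line
        commit_hash, branch = match.group(1), match.group(2)
        if branch == "undefined":
            continue  # assumed marker for a commit that could not be named
        groups[branch].append(commit_hash)
    return dict(groups)

# Hypothetical sample input; the hashes are invented for illustration.
SAMPLE = "\n".join([
    "a" * 40 + " branch1",
    "b" * 40 + " branch1~2",
    "c" * 40 + " branch2^",
    "d" * 40 + " undefined",
])

if __name__ == "__main__":
    for branch, hashes in group_by_branch(SAMPLE).items():
        print(branch, len(hashes))
```

In practice the input would be the output of the rev-list / name-rev pipeline above rather than the inline sample.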
If I grasp your problem space, I think you can use --sha1-name
git show-branch --topo-order --topics --sha1-name origin/master origin/branch1 origin/branch2
to list what you are interested in, then run the commits through git-what-branch
git-what-branch: Discover what branch a commit is on, or how it got to a named branch. This is a Perl script from Seth Robertson.
and format the report to suit your needs?
There is no correct answer to this question because it is underspecified.
Git history is simply a directed acyclic graph (DAG), and it's generally impossible to determine semantic relationships between two arbitrary nodes in a DAG unless the nodes are sufficiently labeled. Unless you can guarantee that the commit messages in your example graph follow a reliable, machine-parseable pattern, the commits are not sufficiently labeled: it's impossible to automatically identify the commits you are interested in without additional context (e.g., guarantees that your developers follow certain best practices).
Here's an example of what I mean. You say that commit a1 is associated with branch1, but this can't be determined with certainty just by looking at the nodes of your example graph. It's possible that once upon a time your example repository history looked like this:
      *    merge branch1 into branch2 - branch2's head
      |\
     _|/
    / *    b1
   |  |
   |  |
  _|_/
 / |
|  *       a1
* /        m1
|/
|
*          start - master's head
Note that branch1 doesn't even exist yet in the above graph. The above graph could have arisen from the following sequence of events:
1. branch2 is created at start in the shared repository
2. user#1 creates a1 on his/her local branch2 branch
3. user#2 creates m1 and b1 on his/her local branch2 branch
4. user#1 pushes his/her local branch2 branch to the shared repository, causing the branch2 ref in the shared repository to point to a1
5. user#2 tries to push his/her local branch2 branch to the shared repository, but this fails with a non-fast-forward error (branch2 currently points to a1 and can't be fast-forwarded to b1)
6. user#2 runs git pull, merging a1 into b1
7. user#2 runs git commit --amend -m "merge branch1 into branch2" for some inexplicable reason
Some time later, user#1 creates branch1 off of a1 and creates a2, while user#2 fast-forward merges m1 into master, resulting in the following commit history:
      *    merge a1 into b1 - branch2's head
   *  |\   a2 - branch1's head
   | _|/
   |/ *    b1
   |  |
   |  |
  _|_/
 / |
|  *       a1
* /        m1 - master's head
|/
|
*          start
Given that this sequence of events is technically possible (although unlikely), how can a human, let alone Git, tell you which commits "belong" to which branch?
If you can guarantee that users don't change merge commit messages (they always accept the Git default), and that Git has never changed and will never change the default merge commit message format, then the merge commit's commit message can be used as a clue that a1 started off on branch1. You'll have to write a script to parse the commit messages; there are no simple Git one-liners to do this for you.
Alternatively, if your developers follow best practices (each merge is intentional and is meant to bring in a differently-named branch, resulting in a repository without those stupid merge commits created by git pull), and you are not interested in the commits from a completed child branch, then the commits you're interested in are on the first-parent path. If you know which branch is the parent of the branch you are analyzing, you can do the following:
git rev-list --first-parent --no-merges parent-branch-ref..branch-ref
This command lists the SHA1 identifiers for the commits that are reachable from branch-ref, excluding the commits reachable from parent-branch-ref and the commits that were merged in from child branches.
In your example graph above, assuming parent order is determined by your annotations and not by the order of the lines going into a merge commit, git rev-list --first-parent --no-merges master..branch1 would print the SHA1 identifiers for commits a4, a3, a2, and a1 (in that order; use --reverse if you want the opposite order), and git rev-list --first-parent --no-merges master..branch2 would print the SHA1 identifiers for commits b4, b3, b2, and b1 (again, in that order).
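To make the first-parent selection concrete, here is a small sketch that imitates the effect of that rev-list invocation on a toy model of the question's graph. It is an illustration only, not how Git implements it: the PARENTS table is hand-written, its parent order is read from the annotations (an assumption, as noted above), and the merge commit names and function names are made up.

```python
# Toy model of the question's history: commit -> ordered list of parents.
# Parent order follows the "merge X into Y" annotations (first parent is on Y).
PARENTS = {
    "start": [],
    "m1": ["start"], "m2": ["m1"], "m3": ["m2"],
    "a1": ["start"], "a2": ["a1"],
    "b1": ["m1"],
    "merge_branch1_into_branch2": ["b1", "a1"],
    "b2": ["merge_branch1_into_branch2"],
    "merge_master_into_branch1": ["a2", "m1"],
    "a3": ["merge_master_into_branch1"],
    "merge_branch2_into_branch1": ["a3", "b2"],
    "a4": ["merge_branch2_into_branch1"],
    "b3": ["b2"], "b4": ["b3"],
}

def ancestors(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def first_parent_commits(parents, parent_tip, branch_tip):
    """Non-merge commits on the first-parent path of branch_tip, minus parent_tip's history."""
    excluded = ancestors(parents, parent_tip)
    result = []
    commit = branch_tip
    while commit is not None and commit not in excluded:
        if len(parents[commit]) < 2:  # skip merge commits, like --no-merges
            result.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    return result

if __name__ == "__main__":
    print(first_parent_commits(PARENTS, "m3", "a4"))
    print(first_parent_commits(PARENTS, "m3", "b4"))
```

On this toy graph the two calls give the a-commits and the b-commits respectively, newest first, which matches the order described above.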
If your developers do not follow best practices and your branches are littered with those stupid merges created by git pull (or an equivalent operation), but you have clear parent/child branch relationships, then writing a script to perform the following algorithm may work for you:
1. Find all commits reachable from the branch of interest excluding all commits from its parent branch, its parent's parent branch, its parent's parent's parent branch, etc., and save the results. For example:
git rev-list master..branch1 >commit-list
2. Do the same for all child, grandchild, etc. branches of the branch of interest. For example, assuming branch2 is considered to be a child of branch1:
git rev-list ^master ^branch1 branch2 >commits-to-filter-out
3. Filter out the results of step #2 from the results of step #1. For example:
grep -Fv -f commits-to-filter-out commit-list
The trouble with this approach is that once a child branch is merged into its parent, those commits are considered to be part of the parent even if development on the child branch continues. Although this makes sense semantically, it does not produce the result you say you want.
Here are some best practices to make this particular problem easier to solve in the future. Most if not all of these can be enforced via clever use of hooks in the shared repository.
[...] git pull [...]
[...] rebase it onto the parent branch before merging it with --no-ff. If it does have children branches, you can still rebase, but please preserve the --no-ff merges of the children branches (this is trickier than it should be).
If all of your developers follow these rules, then a simple:
git rev-list --first-parent --no-merges parent-branch..child-branch
is all you need to see the commits that were made on that branch minus the commits made on its children branches.
I would suggest doing it kind of the way you described it. But I would work on the output of
git log --format="%H:%P:%s" ^origin/master origin/branch1 origin/branch2
so you can do better tree-walking.
[...] (you can get their SHAs with git rev-parse). Mark every commit with the names of the head you came from and its distance. [...] commit -> known-name [...] Now for each of your commits, you will have a list of distance values (that might be negative) to your branch heads. For each commit, the branch with the least distance is the one the commit was most likely created on.
If you have time, you might want to walk the whole history and then subtract the history of master; that might give slightly better results if your branches have been merged into master before.
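The "least distance wins" idea can be sketched as below. The exact distance rule is not spelled out above, so this sketch assumes one for illustration: a step to a first parent costs 1 and a step to any other parent costs a large penalty (side_cost). The toy PARENTS table, the penalty value, and all function names are hypothetical and are not taken from the linked script.

```python
import heapq

# Toy model of the question's history: commit -> ordered list of parents
# (parent order assumed from the "merge X into Y" annotations).
PARENTS = {
    "start": [],
    "m1": ["start"], "m2": ["m1"], "m3": ["m2"],
    "a1": ["start"], "a2": ["a1"],
    "b1": ["m1"],
    "merge_branch1_into_branch2": ["b1", "a1"],
    "b2": ["merge_branch1_into_branch2"],
    "merge_master_into_branch1": ["a2", "m1"],
    "a3": ["merge_master_into_branch1"],
    "merge_branch2_into_branch1": ["a3", "b2"],
    "a4": ["merge_branch2_into_branch1"],
    "b3": ["b2"], "b4": ["b3"],
}

def ancestors(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def distances_from(parents, head, excluded, side_cost):
    """Cheapest weighted distance from head to each non-excluded commit (Dijkstra)."""
    if head in excluded:
        return {}
    dist = {head: 0}
    heap = [(0, head)]
    while heap:
        d, commit = heapq.heappop(heap)
        if d > dist[commit]:
            continue  # stale heap entry
        for index, parent in enumerate(parents[commit]):
            if parent in excluded:
                continue
            nd = d + (1 if index == 0 else side_cost)
            if nd < dist.get(parent, float("inf")):
                dist[parent] = nd
                heapq.heappush(heap, (nd, parent))
    return dist

def assign_branches(parents, heads, master_tip, side_cost=1000):
    """Assign each commit to the head with the least distance; ties go to the earlier head."""
    excluded = ancestors(parents, master_tip)
    owner = {}
    for name, tip in heads.items():
        for commit, d in distances_from(parents, tip, excluded, side_cost).items():
            if commit not in owner or d < owner[commit][0]:
                owner[commit] = (d, name)
    return {commit: name for commit, (_, name) in owner.items()}

if __name__ == "__main__":
    owners = assign_branches(PARENTS, {"branch1": "a4", "branch2": "b4"}, "m3")
    for commit in sorted(owners):
        print(commit, owners[commit])
```

With a plain step count (side_cost=1) several commits in this toy graph tie between the two heads, which is one reason a weighting along first parents, or the decreasing distance described below, may be preferable.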
Couldn't resist: I made a Python script that does what I described, but with one change: with every normal step, the distance is not increased but decreased. This has the effect that branches that lived longer after a merge point are preferred, which I personally like more. Here it is: https://gist.github.com/Chronial/5275577
Usage: simply run git-annotate-log.py ^origin/master origin/branch1 origin/branch2 and check the quality of the results (it will output a git log tree with annotations).