
Git commit history of a file: showing the right commits

I have the following alias to show me the commit history of any given file:

file-history = log --follow --date-order --date=short -C

It works well, but it never shows merge commits, even though the file may have been modified on a branch that we merged into main, for example.

[image]

The solution is to add the option -m, but then it shows many, many, many merge commits, most of which seem unrelated to the commit history of the file.

What is the right way to write such an alias so that it behaves correctly (as in Bitbucket, for that matter): showing all commits that changed a file, and only those?

EXTRA INFORMATION

Using -m shows way too many commits; concretely:

[image]

(In the red rectangles is what I should see; that's what Bitbucket displays.)

(BTW, I don't understand why the commit da3c94a1 is duplicated.)

Using -c shows even more commits (the first commit that should be reported is at the bottom of the page) and displays the diffs (which I don't want to see here):

[image]

Same results for --cc:

[image]

And --first-parent shows weird results (I don't see the commits I'm interested in at all):

[image]

NEW EXTRA INFORMATION

And with --first-parent -m, no change:

[image]

but never shows "merge commits", while the file can have been modified in a branch we did merge into main, for example

If you're going to do this, add --first-parent -m (as I see @torek suggests in the comments). Not just -m, which by itself is more of a forensic tool for desperation cases only.

What's going on here is that, without --first-parent, Git is already going to show you those changes, in the commit(s) that made them. The merge isn't introducing any new changes. If Git showed you the merge diffs, it'd wind up showing you everything at least twice.

This is why it's such a good idea to avoid introducing new work or corrections in a merge commit. Doing so makes it impossible to reason about what the merge does. Merge conflict detection and resolution is already unbounded. Git does as well as any VCS can, and better, I think, than every other VCS on the planet, but it can't ever be perfect. Say a value in a list has to have some mathematical relation to the sum of the remaining values: the changes on both branches preserved that relation, but when combined, the relation is broken. Or any other such condition: code is added in one branch that depends on an existing compiled-in value, but the other branch makes it user-configurable. Think about that one for a while.

So --first-parent -m -p will show you the diffs even for merges, but only the changes introduced by the merged branch and conflict resolutions; work done earlier in the mainline will show up in the commits that introduced it.

You asked specifically about looking for merge commits in your output. I think now, based on all the comments under the question, this was a mistake: you don't want the merge commits at all, even if they do change the file in question. What you want is to stop git log from performing History Simplification.

To do that, simply provide the --full-history flag to git log. But it's also important to know what this flag means: in particular, I don't think you understand what Git is trying to show you here (which is not surprising, as the Git documentation does a terrible job of explaining what Git is trying to do in the first place).

To get to the aha! moment, we have to start with a simple review of stuff you probably already know but may have shoved into the back of your mind and forgotten about:

  • Git is all about commits, and each commit is a numbered entity, found by its big ugly random-looking hash ID;
  • each commit stores a snapshot and some metadata, and the metadata include the raw hash ID of some set of earlier commits; and
  • most commits store just one previous commit hash ID.

This makes commits form simple backwards-looking chains. Let's use single uppercase letters as pretend hash IDs, and allocate them sequentially to make things easy for our puny human brains, and imagine we have a repository that ends with a commit with hash ID H, like this:

A <-B <-C ... <-F <-G <-H

That is, the last (and therefore latest) commit in this repository is commit H. Commit H stores both a full snapshot of every file and a backwards-pointing arrow to (really, the true commit hash ID of) earlier commit G.

Using the stored snapshot in G and the stored snapshot in H, Git can compare the two snapshots. Whatever is different here, those are the files we changed; by comparing those files, Git can produce a diff, showing the particular lines we changed, or Git can just make a list of the files that we changed. That's pretty straightforward, but it does mean that to know what changed in H, Git must extract both snapshots: the one from H, but also the one from its parent G.

The git log command will do this for H, then move back one step to G. Now, to see what changed in G, Git must compare the snapshot of its parent F to the snapshot in G. That suffices for knowing what changed in G.

Now git log can step backwards yet again. This repeats as needed, until we have run all the way back to the very first commit, which, by definition, simply adds all the files it has in its snapshot. There's nothing before the root commit A, so everything is new, and now git log can stop.
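The parent-versus-child comparison described above can be tried in a throwaway repository. This is only a sketch: the file names, commit messages, and the scratch directory from mktemp are invented for the example, and it assumes git is on your PATH.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

# Three commits in a straight line (messages reuse the pretend names F, G, H).
echo one > a.txt;    git add a.txt; git commit -q -m "commit F: add a.txt"
echo two > b.txt;    git add b.txt; git commit -q -m "commit G: add b.txt"
echo three >> a.txt; git add a.txt; git commit -q -m "commit H: edit a.txt"

# What changed in H? Compare H's snapshot with the snapshot of its parent G.
changed_in_h=$(git diff --name-only HEAD~1 HEAD)
echo "changed in H: $changed_in_h"

# git log repeats that comparison at each step back; given a path, it lists
# only the commits whose copy of that path differs from the parent's copy.
touching_a=$(git log --format=%s -- a.txt)
echo "$touching_a"
```

Here commit G does not appear in the path-limited log, because its copy of a.txt is identical to the one in its parent.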

Merges mess with this

That works fine for these simple linear chains, but Git's commits are not always simple linear chains. Suppose we have our simple-so-far repository, where there is only one branch named main and it ends at H, but now we make some new branch names, make some commits on these new branches, and get ready to merge them:

          I--J   <-- br1
         /
...--G--H
         \
          K--L   <-- br2

Commits up through H are on all branches, while commits I-J are only on br1 and commits K-L are only on br2. Using git log at this point shows us J, then I, then H, then G, etc., following the arrows backwards from br1's latest commit; or, it shows us L, then K, then H, then G, etc., following the arrows backwards from br2's latest commit.

Git will of course find file "changes" in the usual way: compare the snapshot in L vs that in K, or K vs H, etc. Since every commit has exactly one parent commit, this works fine.

Once we merge, however, we have a problem. The merge itself works by:

  • comparing H vs J to see what changed on br1;
  • comparing H vs L to see what changed on br2; and
  • combining these changes, and applying the combined changes to the snapshot in H.

This keeps "our" changes on br1 and adds "their" changes on br2, if that's the direction we're doing the merge. Or, it keeps "our" changes on br2 and adds "their" changes on br1. Either way the result is the same (except for conflict resolutions, if any, which depend on how we choose to resolve the conflict).
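The two comparisons a merge starts from can be reproduced by hand with git merge-base and git diff. A minimal sketch, assuming a scratch repository in which br1 and br2 each add one file on top of a shared commit H (all file names are made up for the illustration):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "example@example.com"

echo base > shared.txt; git add shared.txt; git commit -q -m "H"
h=$(git rev-parse HEAD)

git checkout -q -b br1
echo top > top.txt; git add top.txt; git commit -q -m "J"

git checkout -q -b br2 "$h"
echo bottom > bottom.txt; git add bottom.txt; git commit -q -m "L"

# The commit both branches share, which the merge uses as its starting point.
base=$(git merge-base br1 br2)

# The two comparisons: H vs J and H vs L.
on_br1=$(git diff --name-only "$base" br1)
on_br2=$(git diff --name-only "$base" br2)
echo "merge base:     $base"
echo "changed on br1: $on_br1"
echo "changed on br2: $on_br2"
```

The merge result is then the snapshot of the merge base with both sets of changes applied.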

We now have Git make a new merge commit, M, which has:

  • one snapshot, but
  • two parents.

It looks like this:

          I--J
         /    \
...--G--H      M
         \    /
          K--L

I have taken the labels away because at this point we often do that: M is now the latest main commit instead, and when we add another new commit N it just extends main:

          I--J
         /    \
...--G--H      M--N
         \    /
          K--L

N is an ordinary single-parent commit as usual, so comparing the snapshot in M vs that in N works as usual, finding the changes as usual.

Merge commit M, on the other hand, is quite thorny. How should git log show the changes? Changes, in Git, require that we look at "the" parent commit. But M does not have the parent. M has two parents, J and L. Which one should we use?

The -m flag means run two separate git diff operations, one against J, and then a second one against L. That way we'll see what changed vs J, i.e., what we brought in via K-L, and then we'll also see what changed vs L, i.e., what we brought in via I-J.
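To see the two per-parent comparisons for yourself, you can diff a merge against each parent using the HEAD^1 and HEAD^2 notation. The following is a sketch in a scratch repository; the files top.txt and bottom.txt are invented stand-ins for the work done on each side.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "example@example.com"

echo base > shared.txt; git add shared.txt; git commit -q -m "H"

git checkout -q -b br2
echo bottom > bottom.txt; git add bottom.txt; git commit -q -m "L"

git checkout -q main
echo top > top.txt; git add top.txt; git commit -q -m "J"

# Merge br2 into main: M gets J as its first parent and L as its second.
git merge -q --no-edit br2

vs_first=$(git diff --name-only HEAD^1 HEAD)    # M vs J: what came in from L's side
vs_second=$(git diff --name-only HEAD^2 HEAD)   # M vs L: what came in from J's side
echo "M vs first parent:  $vs_first"
echo "M vs second parent: $vs_second"

# With -m, git log shows the merge once per parent, with the same file lists.
git log -m -1 --name-only --format='%h %s'
```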

Adding --first-parent means follow just one of these lines, so that at M we'll see, e.g., what happened in K-L, but then we won't look at K or L at all any more. We'll just move back to J. The effect is that Git pretends, for the duration of -m --first-parent, that the commit graph looks like this:

...--G--H--I--J--M--N

This is, more or less, literally what you asked for, but it's not what Bitbucket is doing.
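The effect of --first-parent on a path-limited log can be checked in a small scratch repository. In this sketch (file names and commit messages are invented), F.txt is changed only on the side branch, so the plain log reports the side-branch commit while the --first-parent log reports the merge instead:

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "example@example.com"

echo v1 > F.txt; git add F.txt; git commit -q -m "H: add F.txt"

git checkout -q -b br2
echo v2 > F.txt; git add F.txt; git commit -q -m "L: change F.txt on br2"

git checkout -q main
echo x > other.txt; git add other.txt; git commit -q -m "J: unrelated change on main"
git merge -q --no-edit br2    # merge commit M, now at HEAD

# Plain path-limited log: the commit on br2 that actually edited the file.
default_top=$(git rev-list HEAD -- F.txt | head -n 1)

# --first-parent: M is compared with J only, so M itself is what shows up.
first_parent_top=$(git rev-list --first-parent HEAD -- F.txt | head -n 1)

git log --format='%h %s' -- F.txt
git log --first-parent -m --name-only --format='%h %s' -- F.txt
```

This would be consistent with the question's observation that --first-parent hides the commits of interest: the work shows up attributed to the merge rather than to the commit that made it.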

Undoing the merge mess several other ways

We can, if we so choose, have git log compare M vs both J and L (i.e., make two separate git diffs) but then discard most of the results of these two diffs. Git has two different "combined diff" modes, which you can get with -c or --cc.

Unfortunately, neither one does what you want. They're also rather difficult to explain (and I still don't really know what the true difference between the two is, though they are demonstrably different: I can show some differences, but I don't know what the goals of the two different options are).

History Simplification

The real key here, though, is this. Suppose there is some file F (a file this time, not the commit F from earlier) that appears in all three commits M, J, and L. Remember, this particular snippet of our picture looks like this:

       I--J
      /    \
...--H      M
      \    /
       K--L

  • If F is the same in all three commits, it's not "interesting" in this merge. Nobody made any changes to it.
  • If F matches in J vs M, but is different in L vs M, then "something interesting" happened. The same is true if F matches in L vs M, but is different in J vs M.

What git log does in most cases here is to try to find out about the final state of the file. Why does file F look the way it does in M? But think about this: if F differs in J vs M but matches in L vs M, then anything we did to the file along the top row is irrelevant! We threw away the top-row copy of file F and kept only the bottom-row copy.

So, if you're asking git log about file F at this point, git log simply does not bother to look at commits I-J. It follows only the bottom row.

On the other hand, if F exactly matches in J-vs-M but differs in L-vs-M, git log -- F will follow only the top row, because we threw away anything that came out of the bottom row.

This is History Simplification in a nutshell. The git log command will, at merge points, throw out one "side" of the merge entirely if it can. If the file(s) we care about match one side, that's the side git log will pick. If the file(s) we care about match all sides of the merge, git log will pick just one side (which one is, for practical purposes, arbitrary) and follow that side.

This means git log never even looks at any of the commits on the "other side" of the merge, so you will not see any of those commits in the git log output. The program assumes that since the merge took "one side" over the other, that's the interesting side, and everything that might show up on the other is irrelevant dross, to be discarded.

This is sometimes what you want

The reason git log does this kind of history simplification is that it assumes your goal is to know why the file looks the way it does in the latest version. Any irrelevant-dross commits that got thrown out don't matter, so let's not even look at them.

When that's what you want, that's what you want! But sometimes you want to know: I'm sure I changed this myself, where was that? or something similar. Here, you must tell git log not to do history simplification at all. The flag for this is --full-history. There are other history simplification flags, so that you can control the simplification: it is useful after all. Read through the History Simplification section of the git log documentation to see them.
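The difference between the default simplification and --full-history can be observed in a scratch repository. This sketch deliberately discards one side's version of the file by merging with the "ours" strategy; the branch name br1, the file names, and the commit messages are all invented for the illustration.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "example@example.com"

echo v1 > F.txt; git add F.txt; git commit -q -m "H: add F.txt"

git checkout -q -b br1
echo v2 > F.txt; git add F.txt; git commit -q -m "J: change F.txt on br1"

git checkout -q main
echo x > other.txt; git add other.txt; git commit -q -m "L: unrelated change on main"

# Merge br1 but keep main's snapshot unchanged: br1's copy of F.txt is thrown away.
git merge -q --no-edit -s ours br1

# Default: F.txt in the merge matches the main side, so only that side is walked.
simplified=$(git rev-list --count HEAD -- F.txt)

# --full-history: every parent of the merge is walked.
full=$(git rev-list --count --full-history HEAD -- F.txt)

echo "default:        $simplified commit(s)"
echo "--full-history: $full commit(s)"
git log --full-history --format='%h %s' -- F.txt
```

In this setup the default log lists only the commit that added the file, while --full-history also lists the discarded change on br1 and the merge that discarded it. If that is the view you need, --full-history could be appended to the file-history alias from the question; this sketch does not exercise --follow or -C, so how they combine is left untested here.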

It's worth adding one more point here, having to do with so-called evil merges (see Evil merges in git?). We could have:

...--J
      \
       M
      /
...--L

where the snapshot files in M are utterly unrelated to the files in either J or L. More commonly, we might have some file in M that has a change hidden away in it that does not come from the top or bottom rows at all, but rather was produced due to the fact that there were conflicts and/or the combined changes didn't work.

If the "hidden" change is due to conflicts, that's not so bad, but if someone stuck in an unrelated fix, we have an issue. In particular, git log by default does not display merge commits at all when using git log -- path. It assumes that anything interesting that will show up from the path argument will be found on either the top or bottom row, in a commit before the merge. But an "evil merge" might introduce an "interesting change" that isn't in either row, and this is when you must force git log to look at merge commits, using -m, or -c, or --cc.

What Bitbucket does with its software is of course up to them. We don't know if they are currently using --full-history --cc, for instance. We don't know whether, in future, they might change the internal git log options. So there's no real point in trying to make your command-line git log output exactly match your Bitbucket view output, as the latter is not under your control in the first place. If you are going to use git log, then, concentrate instead on knowing what git log is doing and how to make that work to your advantage.
