Git: What does squash do? And why cherry-pick an older commit?

The latest commit should be the one that incrementally contains all changes of the previous commits, right?

If that is so, what is --squash for: just compressing all commits into one to clean up history?

Does squash actually just delete all commits except the latest one?

The same goes for cherry-picking. If the latest commit contains all changes of the previous commits, then why would you need to pick an older commit?

I'm not sure whether you're working under a misconception here. A lot of older source code management systems store each commit (or check-in or whatever they call it) as a "change". Git doesn't: a commit has each file fully intact, [1] stored as a repository object of type "blob".

The "blob" objects sit at the bottom level, as it were, pointed-to by "tree" objects (which hold the names, and executable-bits, of files/directories), and the trees are pointed-to by "commit" objects. (There's one last repository-level object type, the "annotated tag", which normally points to a commit.) So given a commit SHA-1 and a repository path like dir/file, git starts by extracting the commit object, which leads to a tree that needs to have an entry for dir. That entry needs to lead to another tree, which must then have an entry for file, and then that entry should be a blob, and that's the version of dir/file that appears in that commit.
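To make the commit, tree, blob chain concrete, here is a minimal sketch that walks it by hand in a throwaway repository. The temporary directory, the identity settings, and the file contents are assumptions made up for the illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# One commit containing dir/file
mkdir dir
printf 'hello\n' > dir/file
git add dir/file
git commit -q -m "add dir/file"

commit_id=$(git rev-parse HEAD)             # the commit object
tree_id=$(git rev-parse "HEAD^{tree}")      # top-level tree, has an entry for dir
subtree_id=$(git rev-parse "HEAD:dir")      # tree for dir, has an entry for file
blob_id=$(git rev-parse "HEAD:dir/file")    # blob holding the full file contents

git cat-file -p "$commit_id"    # shows "tree <id>", author, committer, message
git cat-file -p "$tree_id"      # shows the entry for dir
git cat-file -p "$subtree_id"   # shows the entry for file
contents=$(git cat-file -p "$blob_id")
echo "blob contents: $contents"
```

Each `git cat-file -p` call prints one object, so you can follow the IDs from one level down to the next.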

Branch and tag names, like master, are just human-readable words giving the "true name" SHA-1 of the underlying repository object. Commit objects have "parent SHA-1" values in them, allowing git to extract a commit-graph dynamically.
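A small sketch of both points, again in a scratch repository whose setup is assumed for the example: the name master resolves to a plain commit ID, and the parent ID is recorded inside the commit object itself.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\n' > file.txt
git add file.txt
git commit -q -m "first"
printf 'two\n' > file.txt
git commit -q -a -m "second"

tip=$(git rev-parse master)        # the ID that the name master stands for
parent=$(git rev-parse master^)    # its parent, found by following the graph

# The same parent ID is stored as a "parent" line in the commit object
recorded_parent=$(git cat-file -p "$tip" | sed -n 's/^parent //p')
echo "master          -> $tip"
echo "recorded parent -> $recorded_parent"
```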

You can, of course, still get change-sets out of git. It just computes them dynamically, every time.

Suppose we have this graph of commits, where I use one uppercase letter to represent each 40-character SHA-1, and some branch-names:

A - B - C - D      <-- master
      \
        E - F      <-- branch

The name master just names commit D (which might really be dcfaa9d9767a010c143ffd42b01b84d2abb4cffc). That commit has one parent, C (really 222c4dd...). C has one parent B, B has one parent A, and A has no parents at all: it's a root commit.

Commit F cannot be reached by starting at commit D and working backwards through any parents; it is only reachable by its branch-name branch. F's parent is E, and E's (single) parent commit is B, though, so starting from F we can work backwards through B to A.
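If you want to check reachability yourself, something like the following should do it. It rebuilds the A-to-F graph in a scratch repository; the `commit` helper and the one-file-per-commit layout are invented for the illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after its letter
commit() { printf '%s\n' "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git branch branch            # branch forks off at B
commit C; commit D           # master: A - B - C - D
git checkout -q branch
commit E; commit F           # branch: A - B - E - F
git checkout -q master

# Is F (the tip of branch) reachable from master? It should not be.
if git merge-base --is-ancestor branch master; then f_on_master=yes; else f_on_master=no; fi
# Is B (master~2) reachable from branch? It should be.
if git merge-base --is-ancestor master~2 branch; then b_on_branch=yes; else b_on_branch=no; fi

# Count the commits reachable from branch but not from master (E and F)
only_on_branch=$(git rev-list --count master..branch)
echo "F on master: $f_on_master, B on branch: $b_on_branch, only on branch: $only_on_branch"
```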

This is where regular merges come in: they operate on the commit graph. If we "merge branch into master" (let's do this temporarily, then back it out again):

$ git checkout master
Switched to branch 'master'
$ git merge -m regular-merge branch
[snip]

we make one new merge commit M:

A - B - C - D - M  <-- master
      \       /
        E - F      <-- branch

This (real, non-"squash") merge has two parents, D and F. (And, importantly, D is the "first" parent: this tells you which commit was originally "on master" before the merge.) So now the two commits that used to be reachable only via branch (starting at F and working backwards) can be found on master too.
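Here is a sketch that checks the two parents of M and their order. The scratch repository and the `commit` helper are assumptions for the example, not anything from a real project.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after its letter
commit() { printf '%s\n' "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git branch branch
commit C; commit D
git checkout -q branch
commit E; commit F
git checkout -q master

D=$(git rev-parse master)
F=$(git rev-parse branch)

git merge -q -m regular-merge branch       # creates merge commit M on master

first_parent=$(git rev-parse master^1)     # what master named before the merge
second_parent=$(git rev-parse master^2)    # the commit that was merged in
echo "M^1 = $first_parent"
echo "M^2 = $second_parent"

# F is now reachable from master as well (the command exits 0 when it is)
git merge-base --is-ancestor "$F" master
```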

What about the various file contents? Well, that's up to whoever does the merge. Using git's automatic merging, if you "merge branch into master" and there are no conflicting changes, you'll get what you'd expect. However, you could merge with -s ours to discard the contents of the changes in E and F. You'll still get a merge commit M, but its tree will be identical to the tree in commit D. [2]
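The -s ours case is easy to verify: the merge commit gets two parents, but its tree ID matches D's. A minimal sketch, with the repository layout and `commit` helper assumed for the demonstration:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after its letter
commit() { printf '%s\n' "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git branch branch
commit C; commit D
git checkout -q branch
commit E; commit F
git checkout -q master

tree_D=$(git rev-parse "master^{tree}")    # D's tree, before merging

git merge -q -s ours -m ours-merge branch  # merge commit that ignores E and F's changes

tree_M=$(git rev-parse "master^{tree}")
second_parent=$(git rev-parse master^2)
echo "tree of D: $tree_D"
echo "tree of M: $tree_M"
```

The files E.txt and F.txt never show up on master here, even though F is recorded as the second parent.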

At any time you can also ask git to produce the change-set from one commit to another. So if you want to see what changed between B and E, you could find the SHA-1 for both and do:

git diff <sha1-for-B> <sha1-for-E>

To see what changed between E and F, you can simply use their SHA-1s, and to see "what happened on branch branch", use the SHA-1s for B and F.

As a handy shortcut, to see what happened between some commit and its parent (let's not worry about merges here, since they have multiple parents), we can just use, e.g.:

git show <sha1-for-F>

The git show command will (among other things) find F's parent and run a diff between E and F, to show us what the changes were.

Instead of writing out the full (or partial) SHA-1, we can just use the branch name:

git show branch

In general, if a raw SHA-1 will work, so will a branch name (but not always vice versa).

Naturally, since the SHA-1s are unwieldy, there are lots more ways to name these things; but let's just ignore that for now and finally get to "squash merges" and "cherry pick".

Let's "un-do" the merge into master so that master and branch are separate once again: [3]

$ git status   # just checking! "git reset --hard" could lose work
# On branch master
nothing to commit, working directory clean
$ git reset --hard master^
[snip]

We're now back to this commit graph:

A - B - C - D      <-- master
      \
        E - F      <-- branch

Let's say that change F fixes a nasty bug and we want a copy of it in master. We tried to take it in by using git merge branch, but that brought in change E too, which is not ready yet.

So, now we just "cherry pick" commit F:

$ git cherry-pick branch
[snip]

As usual we can use the branch name to identify the (single) commit at the tip of the branch.

This tells git to gather up the changes between E and F, just like git show would. Instead of showing them to us, though, git tries to patch those changes into the current (HEAD) commit. Since we're on branch master, the HEAD commit is commit D. So this extracts the changes from E to F, applies them to D, and if successful, makes a new commit. Let's call it P (for Pick):

A - B - C - D - P  <-- master
      \
        E - F      <-- branch

Here the contents of P may be quite different from the contents of F, but the change (from D to P) is the same as the change from E to F. The diff output of git show master and git show branch should be very similar: the line numbers might change a bit (or even a lot) but the changes shown should be the same.
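One way to check "same change, different contents" is to compare patch IDs, which git computes from the diff alone, ignoring line numbers. This is only a sketch: the scratch repository, the `commit` helper, and the `patch_id` helper are made up for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after its letter
commit() { printf '%s\n' "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git branch branch
commit C; commit D
git checkout -q branch
commit E; commit F
git checkout -q master

D=$(git rev-parse master)
git cherry-pick branch >/dev/null          # copies the E-to-F change onto D as commit P

# Hypothetical helper: patch ID of the diff that "git show" prints for a commit
patch_id() { git show "$1" | git patch-id | cut -d' ' -f1; }

pid_F=$(patch_id branch)
pid_P=$(patch_id master)
tree_F=$(git rev-parse "branch^{tree}")
tree_P=$(git rev-parse "master^{tree}")
parent_P=$(git rev-parse master^)
echo "patch id of F: $pid_F"
echo "patch id of P: $pid_P"
```

Here P contains F.txt but not E.txt, so the trees of P and F differ while the two patch IDs agree.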

Let's toss out P the same way we tossed out the merge M earlier. [4] Note that we're still on branch master here, and it's still clean (nothing going on, nothing to commit, even though I'm not bothering with git status this time):

$ git reset --hard master^
[snip]

This time, let's do a "squash-merge" of branch.

The action of this git merge is very similar to a regular merge, except that instead of merge-commit M with two parents, it will set up a "squash merge" commit, let's call it S, with only one parent. It doesn't actually do the commit (squash implies --no-commit) so we have to do the commit part explicitly:

$ git merge --squash branch
Squash commit -- not updating HEAD
Automatic merge went well; stopped before committing as requested
$ git commit -m squash-merge
[snip]

Now we have this:

A - B - C - D - S  <-- master
      \
        E - F      <-- branch

The tree for commit S (the set of all files) will be the same as the tree you'd get with a regular merge. In this case, that would be the equivalent of applying, as a single patch, the git diff between B and F, to the tree-contents of commit D. [5]

In other words, S has "the changes between B and E plus the changes between E and F", applied to D. But it has only a single parent. It's this commit-graph difference that makes it a "squash merge" rather than a regular merge.
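To see that difference directly, the sketch below does a regular merge, records its tree, backs it out, and then does the squash merge. As before, the scratch repository and the `commit` helper are assumptions for the illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after its letter
commit() { printf '%s\n' "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git branch branch
commit C; commit D
git checkout -q branch
commit E; commit F
git checkout -q master
D=$(git rev-parse master)

# Regular merge M: remember its tree, then back it out
git merge -q -m regular-merge branch
tree_M=$(git rev-parse "master^{tree}")
git reset -q --hard "$D"

# Squash merge S: stops before committing, so commit explicitly
git merge -q --squash branch >/dev/null
git commit -q -m squash-merge
tree_S=$(git rev-parse "master^{tree}")

# Count the "parent" lines recorded in commit S
parents_S=$(git cat-file -p master | grep -c '^parent ')
echo "tree of M: $tree_M"
echo "tree of S: $tree_S (parents: $parents_S)"
```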

Of course, if the regular merge didn't work (commit E was not ready for master), the squash merge won't work either. So here cherry-pick is the sensible option.

Important aside: note how each time we did something to branch master that added a new commit (the merge M, the cherry pick P, or the squash-commit S), the branch master automatically "moved forward" to point to the newest commit. That's what distinguishes a branch (or a "local branch") from other labels, in git. A branch name is just a label for a commit ID that automatically moves as you add new commits. Tag names work exactly like branch names except that they don't move.
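A quick sketch of the branch-versus-tag difference, in a scratch repository; the tag name v-example is made up for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

printf 'one\n' > file.txt
git add file.txt
git commit -q -m "first"

git tag v-example                       # lightweight tag on the current commit
before=$(git rev-parse master)

printf 'two\n' > file.txt
git commit -q -a -m "second"            # add a new commit while on master

after=$(git rev-parse master)           # the branch name moved to the new commit
tag_after=$(git rev-parse v-example)    # the tag still names the old one
echo "master before: $before"
echo "master after:  $after"
echo "tag after:     $tag_after"
```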


[1] Well, blobs have them intact, but compressed (with deflate compression), as long as they are "loose objects". Eventually loose objects are "packed" to save even more space, and packs can then be "delta-compressed", giving all the space savings available in the older delta-based SCMs, and actually more, because any one object can be compressed against any other object, at least in theory. File foo does not have to be compressed only against "previous version of file foo".

[2] This is mostly meant as a way to document the "killing off" of a branch, as an alternative to simply abandoning it.

[3] The reset --hard does two things: it modifies the working directory back to the state it had at commit D, and it changes the branch label master to point back to commit D again. The simple ^ suffix here tells git to follow the "first parent". The other main syntax, "going back N commits" (e.g., master~3 goes back 3), also follows "first parents", so from merge commit M, master~3 would count back to D, then C, then B. Once the reset has taken effect, though, master names commit D again, so going back 3 goes to C, then B, then A instead.
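The two countings in this footnote can be checked with a sketch like this one. The scratch repository and the `commit` helper are assumptions for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after its letter
commit() { printf '%s\n' "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; A=$(git rev-parse HEAD)
commit B; B=$(git rev-parse HEAD)
git branch branch
commit C; commit D
D=$(git rev-parse HEAD)
git checkout -q branch
commit E; commit F
git checkout -q master

git merge -q -m regular-merge branch     # master now names merge commit M
from_M=$(git rev-parse master~3)         # M -> D -> C -> B, first parents only

git reset -q --hard master^              # back to M's first parent, D
after_reset=$(git rev-parse master)
from_D=$(git rev-parse master~3)         # D -> C -> B -> A
echo "master~3 from M: $from_M"
echo "master~3 from D: $from_D"
```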

[4] Incidentally, you might (in fact, you should) wonder: what happens to these commits we're casually "tossing out"? The answer is: they live on in the repo, labeled through the "reflog", until the reflog entries time out. By default, a reflog entry "expires" after 90 days if it's "reachable" (defining this gets a bit too technical for a footnote that's already too long) and after 30 days if it's "unreachable". These are "unreachable", so they expire in about a month. After that, the commits, and any trees and blobs that only these tossed-out commits use, are garbage-collected on the next git gc.
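A sketch of finding a tossed-out merge again through the reflog. It assumes a scratch repository with default reflog settings; the `commit` helper is hypothetical.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: each commit adds one file named after its letter
commit() { printf '%s\n' "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A; commit B
git branch branch
commit C; commit D
git checkout -q branch
commit E; commit F
git checkout -q master

git merge -q -m regular-merge branch
M=$(git rev-parse master)                   # remember the merge commit
git reset -q --hard master^                 # toss it out

on_branches=$(git branch --contains "$M")   # empty: no branch reaches M now
previous=$(git rev-parse 'master@{1}')      # where master pointed before the reset
kind=$(git cat-file -t "$M")                # the object itself is still stored
echo "master@{1} = $previous"
git reflog show master
```

Running git reset --hard 'master@{1}' at this point would bring the merge back.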

[5] This assumes you don't do something weird like apply the ours strategy to the squash-merge, but that would be useless. (Also, unlike the not-part-of-git patch command, git is smart when doing merges and cherry-picks and such, in that it can usually tell if you already have some particular patch applied, and not try to apply it twice.)

Yes, squashing will collapse commit history. You might do this if, for example, you have been working on a feature on a separate branch, and you don't want to pollute a mainline branch with hundreds of commits. On the other hand, you'd probably only want to squash a branch you're closing, since you lose the tracking benefit of a non-squashed merge. (You can look at the tree and see which commit the merge came from in the case of a "normal" merge; not so with a "squashed" merge.)

Usually you cherry-pick changes from other branches; your current branch won't contain a commit you want to cherry-pick.
