Why does Git merge the index file instead of completely overriding it when checking out a commit
This is a question about Git internals. I use low-level commands here and don't use branches.
echo f1 > f1.txt
echo f1 > f1.txt
git add .
git commit -m "first"
...cf178d5
Now I want to create a new commit with one file, f3.txt, using the index and the write-tree command:
$ rm f1.txt f2.txt
$ echo ‘f3 content’ > f3.txt
$ git add .
So currently the index file and the directory contain only the new f3.txt file:
$ git ls-files -s
100644 [some hash] 0 f3.txt
$ ls
f3.txt
So I write the tree to the repository and update HEAD with the new commit hash:
LATEST_TREE_HASH=$( git write-tree )
echo $LATEST_TREE_HASH > .git/HEAD
If I now run git status I get:
$ git status
Not currently on any branch.
nothing to commit, working directory clean
If I now check out the first commit, which has the two files f1.txt and f2.txt:
$ git checkout cf178d5
A f3.txt <--------------- why?
HEAD is now at a27a75a... initial commit
Git works fine, but I believe it merges trees in the index instead of overriding. You can see from the git checkout output that it treats f3.txt as an added file, and if I check the index file contents:
$ git ls-files -s
100644 [some hash] 0 f1.txt
100644 [some hash] 0 f2.txt
100644 [some hash] 0 f3.txt
$ ls
f1.txt f2.txt f3.txt
It shows three files. What is the reason for this behavior?
Edit: the question has changed enough to invalidate the previous response.
There's still a typo (f1.txt listed twice) and funky non-ASCII Unicode quote marks, but we can now see what is going wrong here:
$ LATEST_TREE_HASH=$( git write-tree )
$ echo $LATEST_TREE_HASH > .git/HEAD
This is a bit of a problem. As Mark Adelsberger noted in a comment, and as your script itself says by using the word TREE here, git write-tree writes a tree, not a commit.
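For comparison, here is a minimal sketch of how the tree could be wrapped in a real commit before HEAD is touched. The scratch repository, the demo identity, and the variable names are assumptions made for illustration; git commit-tree and git update-ref are the stock plumbing commands.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository so nothing real is modified.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# commit-tree needs an identity; these values are placeholders.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo 'f3 content' > f3.txt
git add f3.txt

# write-tree produces a tree object; commit-tree wraps it in a commit.
tree=$(git write-tree)
commit=$(echo 'second commit' | git commit-tree "$tree")

tree_type=$(git cat-file -t "$tree")
commit_type=$(git cat-file -t "$commit")

# Store the commit hash (not the tree hash) directly in HEAD,
# without following the symbolic ref, which detaches HEAD.
git update-ref --no-deref HEAD "$commit"
head_type=$(git cat-file -t HEAD)

echo "tree object:   $tree_type"
echo "commit object: $commit_type"
echo "HEAD names a:  $head_type"
```

Using git update-ref rather than redirecting into .git/HEAD lets Git validate the value and record it in the reflog.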
What's in .git/HEAD is supposed to be exactly one of two things: a string of the form ref: refs/heads/name, where name is a valid branch name, or a raw hash ID that names a commit object (a "detached HEAD"). In turn, a branch name (a reference of the form refs/heads/name) must always point to a commit object, never to a blob, tree, or tag object.
This means that Git in general assumes that whatever comes out of .git/HEAD refers to a commit object. By writing this tree hash into .git/HEAD you've violated this assumption. However, to allow for "unborn branches", such as the state of an initial repository with no master yet, HEAD can contain the name of a branch that does not actually exist.
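A quick way to see which of these states HEAD is in is sketched below. It runs in a freshly initialized scratch repository (an assumption for illustration), where HEAD names a branch that has no commit yet; the variable names are hypothetical.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository with no commits yet.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Prints the ref HEAD points at when HEAD is symbolic (attached).
head_target=$(git symbolic-ref -q HEAD)

# Ask whether HEAD currently resolves to a commit object at all.
if git rev-parse -q --verify 'HEAD^{commit}' >/dev/null 2>&1; then
    head_state="resolves to a commit"
else
    head_state="unborn"
fi

echo "HEAD -> $head_target ($head_state)"
```

The branch name printed depends on the init.defaultBranch setting, so it may be master, main, or something else.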
What happens next is, I think, not guaranteed. The git checkout command assumes that if HEAD contains a valid hash, it contains a commit hash, and the only other allowed possibility is that HEAD contains the name of an orphan branch. So we run git checkout target_hash, as in your example:
git checkout cf178d5
Suppose HEAD contained a valid commit hash. Let's call this the old hash, as distinguished from the target commit hash. In this case, git checkout would:
Obviously --force disables the check, but this is the basic process by which both staged and unstaged modifications are carried from one checkout to another when switching branches without being in a "clean" state. The process is described in all its gory detail in the Two Tree Merge section of the git read-tree documentation.
The other possibility allowed by the rules is that you are currently on an orphan branch. In this case, there is no current commit. Most likely, Git simply uses the empty tree as if it were the current commit. It then follows the same rules for case 1, which is now allowed since it has a tree.
But this is, obviously, not guaranteed. If Git were to use the current (valid) tree stored in .git/HEAD as the base tree, instead of the empty tree, and then proceed as for case 1, you'd see your two files get removed. Follow all the sub-cases outlined in git read-tree with $H set to your existing tree, vs $H set to the empty tree. (I admit to not having done so, but I think this is where the behavior comes from. But see also the remark about case 3 in the read-tree documentation!)
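One way to try that comparison is sketched below. It rebuilds a similar situation in a scratch repository and runs git read-tree -m twice on copies of the index, once with $H as the empty tree and once with $H as the tree that holds only f3.txt. The repository layout, the index copies, and the variable names are assumptions for illustration, and this exercises only the index-level two-tree merge, not everything git checkout does.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Target tree ($M): f1.txt and f2.txt.
echo f1 > f1.txt
echo f2 > f2.txt
git add f1.txt f2.txt
target=$(git write-tree)

# Current state: index and work-tree hold only f3.txt.
git read-tree --empty
rm f1.txt f2.txt
echo 'f3 content' > f3.txt
git add f3.txt
current=$(git write-tree)

# The empty tree, written into the object database.
empty=$(git hash-object -w -t tree /dev/null)

# Case A: $H is the empty tree (run on a copy of the index).
cp .git/index "$repo/.git/index.a"
GIT_INDEX_FILE="$repo/.git/index.a" git read-tree -m "$empty" "$target"
with_empty=$(GIT_INDEX_FILE="$repo/.git/index.a" git ls-files | paste -sd' ' -)

# Case B: $H is the tree that matches the current index.
cp .git/index "$repo/.git/index.b"
GIT_INDEX_FILE="$repo/.git/index.b" git read-tree -m "$current" "$target"
with_current=$(GIT_INDEX_FILE="$repo/.git/index.b" git ls-files | paste -sd' ' -)

echo "H = empty tree:   $with_empty"
echo "H = current tree: $with_current"
```

With the empty tree as $H, f3.txt is in the index but in neither tree, so the two-tree rules keep it; with the current tree as $H, f3.txt matches $H and is absent from $M, so it is dropped.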
1 Git actually achieves this using a temporary index, stored in the index.lock file. If all goes well, the temporary index is renamed to become the regular index, unlocking the index in the process. If things go poorly, Git removes the temporary index.lock file, discarding the temporary index and unlocking the index.
There's another set of funky non-ASCII quote marks that made cut-and-paste of your instructions fail, so that when I made the normal first commit I ended up with two files:
$ git commit -m "second commit"
[master (root-commit) b9c7e4b] second commit
2 files changed, 1 insertion(+)
create mode 100644 f3.txt
create mode 100644 f3.txtcontentecho
$ ls
f3.txt f3.txtcontentecho
$ git ls-files -s
100644 5927d85c2470d49403f56ce27afd8f74b1a42589 0 f3.txt
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0 f3.txtcontentecho
But note that my ls vs ls-files -s output differs enormously from yours at this point:
So currently the index file and the directory contains only new f3.txt file:
$ git ls-files -s
100644 [some hash] 0 f3.txt
$ ls
f1.txt f2.txt
It's not at all clear to me why you would have files f1.txt and f2.txt in your work-tree now; I don't.
Now we create a commit with git commit-tree and run git checkout:
$ INITIAL_COMMIT_HASH=$( \
> echo 'initial commit' | git commit-tree $INITIAL_TREE_HASH )
$ git checkout $INITIAL_COMMIT_HASH
but what I get is very different:
Note: checking out 'cd1bc16160c8a2814cd94bc8397230ffe5a16c22'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at cd1bc16... initial commit
and everything reads as I would expect (files f1.txt and f2.txt are in the work-tree and the index; neither of the f3 files is visible). Running git log --graph --all shows the expected two (disconnected) commits (both are root commits, with no parents).