
git: dangling blobs

I recently ran git fsck --lost-found on my repository.

I expected to see a couple of dangling commits from the times I had reset HEAD.

However, I was surprised to see what was likely several thousand dangling blob messages.

I don't believe anything is wrong with my repository, but I'm curious as to what causes these dangling blobs. There are only two people working on the repository, and we haven't done anything out of the ordinary.

I wouldn't think they were created by an older version of a file being replaced by a new one, since git would need to hold onto both blobs so it can display history.

Come to think of it, at one point we did add a VERY large directory (thousands of files) to the project by mistake and then remove it. Might this be the source of all the dangling blobs?

Just looking for insight into this mystery.

Last time I looked at this I stumbled across this thread, specifically this part:

You can also end up with dangling objects in packs. When that pack is repacked, those objects will be loosened, and then eventually expired under the rule mentioned above. However, I believe gc will not always repack old packs; it will make new packs until you have a lot of packs, and then combine them all (at least that is what "gc --auto" will do; I don't recall whether just "git gc" follows the same rule).

So it's normal behavior, and does get collected eventually, I believe.

edit: Per Daniel, you can immediately collect it by running:

git gc --prune="0 days"

I am really impatient and use:

git gc --prune="0 days"
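For anyone who wants to see the effect of an immediate prune before running it on a real repository, here is a minimal sketch in a throwaway repository. The file names and the inline identity are made up for illustration, and it uses --prune=now, which is the spelling shown in the git-gc documentation and should behave the same way as the command above.

```shell
# Throwaway repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# One real commit, with the identity passed inline so global config is not needed.
echo "keep" > keep.txt
git add keep.txt
git -c user.name=Example -c user.email=example@example.com commit -q -m "initial"

# Stage a file and unstage it again: its blob is left behind, unreferenced.
echo "scratch" > scratch.txt
git add scratch.txt
git rm -q --cached scratch.txt

before=$(git fsck 2>/dev/null | grep -c '^dangling blob')
git gc -q --prune=now          # prune unreachable objects regardless of age
after=$(git fsck 2>/dev/null | grep -c '^dangling blob')
echo "dangling blobs before gc: $before, after gc: $after"
```

Keep in mind that pruning with no grace period can remove objects that another git process running at the same time is still writing, so it is best done on a quiet repository.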

Whenever you add a file to the index, the contents of that file are added to Git's object database as a blob. When you then reset / rm --cached that file, the blob will still exist (it is no longer referenced, so a later gc will collect it once it is older than the prune grace period, which is two weeks by default).
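This is easy to reproduce on a small scale. The sketch below assumes a scratch repository and a made-up file name; it stages a file, unstages it, and counts what git fsck reports. Doing the same with a directory of thousands of files would give thousands of dangling blobs.

```shell
# Throwaway repository in a temporary directory (file name is hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "added by mistake" > mistake.txt
git add mistake.txt              # writes the blob into the object database
git rm -q --cached mistake.txt   # drops the index entry; the blob stays on disk

# Nothing references the blob any more, so fsck reports it as dangling.
dangling=$(git fsck 2>/dev/null | grep -c '^dangling blob')
echo "dangling blobs: $dangling"
```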

However, when those files are part of a commit and you later decide to reset history, the old commits are still reachable from Git's reflog and will only be garbage collected after a period of time (usually a month, iirc). Those objects should not show up as dangling though, since they are still referenced from the reflog.
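The reflog point can be checked with the --no-reflogs option of git fsck, which tells it not to treat reflog entries as references. A rough sketch, again in a scratch repository with invented file names and an inline identity:

```shell
# Throwaway repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Helper that passes an identity inline so the sketch ignores global config.
commit() {
    git -c user.name=Example -c user.email=example@example.com commit -q "$@"
}

echo one > file.txt && git add file.txt && commit -m "first"
echo two > file.txt && git add file.txt && commit -m "second"
git reset -q --hard HEAD~1       # "second" is now referenced only by the reflog

with_reflog=$(git fsck 2>/dev/null | grep -c '^dangling commit')
without_reflog=$(git fsck --no-reflogs 2>/dev/null | grep -c '^dangling commit')
echo "dangling commits, reflog honoured: $with_reflog"
echo "dangling commits, --no-reflogs:    $without_reflog"
```

The tree and blob of the discarded commit are not listed as dangling in either case, because the commit itself still points at them; fsck only reports the tip of an unreferenced chain.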
