
How to reduce git repo size on Bitbucket?

Summary of my problem: One of my private repositories on Bitbucket suddenly more than doubled in size after I pushed an addition of a few hundred bytes to two existing files. The repo is now over 2GB, which has caused Bitbucket to put it into read-only mode. Because it is in read-only mode, I cannot push changes that would reduce the repo size. (Catch-22.)

Details: My company recently began hosting git repositories on Bitbucket. One of the repositories I am in charge of had a size of about 973MB, which was uncomfortably close to the 1GB soft limit. To reduce the repo size, I followed the instructions in the Bitbucket documentation article "Split a repository in two" and moved about 450MB worth of documentation and online help files into their own private repo. I then followed the instructions in the Bitbucket documentation articles "Reduce repository size" and "Maintaining a git repository", specifically:

git count-objects -vH showed me a size-pack of about 973MB.

I ran git filter-branch --index-filter 'git rm --cached --ignore-unmatch doc' HEAD to remove the doc directory (which is the content I'd moved to the new repo).

I ran the following commands to expire references and prune:

git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --prune=now

git count-objects -vH then showed me a size-pack of 881.1 MiB, and du -sh .git/objects returned 882M. I was disappointed that moving over 450MB reduced the repo size by less than 90MB, but I pushed the changes to Bitbucket nevertheless:

git push --all --force
git push --tags --force

The settings page for the Bitbucket copy of the repo continued to show a size of 973MB. I logged out, refreshed the browser, and logged back in, but that didn't help; the repo size remained at 973MB.

This morning (three days after the changes described above) I made a couple of minor additions to two existing files, which increased the files' sizes by a total of less than 1KB. I added and committed them to my local repo, then pushed the change to Bitbucket. A few minutes later I took a look at the Bitbucket page for the repo and saw a red warning banner informing me "This repo is over the 2 GB limit and is in read-only mode." The settings page now says the repo has a size of 2.3 GB.

According to Bitbucket, the push of a few hundred bytes added to two files was definitely the only activity to occur on the remote repo in the last three days. That push may not have been the cause of the repo more than doubling in size, but the two events were closely correlated in time.

git reflog show returns nothing.

Cloning a new copy into an alternate directory, then running git count-objects, gives me a size-pack of 881.29 MiB.

The local repository is on a CentOS 6.5 system. The git version is 1.8.5.3.

Questions

  1. Why did moving 450MB of files out of the repo only reduce the size of my local repo by 90MB?
  2. Why did even that modest reduction not get pushed to the remote repo on Bitbucket?
  3. How on Earth did the remote repo size jump from 973MB to 2.3GB?
  4. How do I fix it? I cannot push to the remote repo even with the --force flag. Any push gets me the error message "conq: repository is in read only mode (over 2 GB size limit). fatal: Could not read from remote repository."

I've found that the easiest way to reduce the Bitbucket repo size if you are over the 2GB limit is to:

  1. Create a branch on Bitbucket
  2. Delete that branch on Bitbucket

This should trigger Bitbucket to run git gc on the repo.
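
From the command line, the same create-then-delete cycle might look like the sketch below. It assumes the remote is named origin, that the repository still accepts pushes (a repo already in read-only mode will reject them, in which case the Bitbucket web UI would be needed instead), and that gc-trigger-tmp is an unused branch name chosen only for illustration. Whether the deletion actually triggers git gc on Bitbucket's side is the claim made above, not something the script can verify.

```shell
#!/bin/sh
# Sketch: create a throwaway branch on the remote, then delete it.
# REMOTE and TMP_BRANCH are illustrative assumptions, not Bitbucket requirements.
REMOTE="${REMOTE:-origin}"
TMP_BRANCH="${TMP_BRANCH:-gc-trigger-tmp}"

bump_remote() {
    # Push the current commit to a new branch on the remote...
    git push "$REMOTE" "HEAD:refs/heads/$TMP_BRANCH" &&
    # ...and delete that branch again right away.
    git push "$REMOTE" --delete "$TMP_BRANCH"
}
```

Call bump_remote from inside the working copy, then check the size on the repository settings page again after a few minutes.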

After conferring with Bitbucket technical support, I can now answer some of my own questions:

  1. Why did moving 450MB of files out of the repo only reduce the size of my local repo by 90MB? Something in the history got missed. I don't know what exactly, but the filter-branch command missed something. I was able to successfully reduce the repo size by 450MB by running the utility BFG Repo-Cleaner.
  2. Why did even that modest reduction not get pushed to the remote repo on Bitbucket? It did, but Bitbucket support must then run git gc on their side. One can contact Bitbucket support and ask them to run git gc on a repo.
  3. How on Earth did the remote repo size jump from 973MB to 2.3GB? Unknown. Bitbucket technical support didn't have the answer to this one either.
  4. How do I fix it? Contact Bitbucket support. They can put a repository back into read-write mode so that you can push a smaller repository, and they can run git gc on their end.
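
One way to look for whatever "got missed" is to list the largest blobs still reachable from any ref. The sketch below is an assumption-laden diagnostic, not part of the original answer: largest_blobs is a hypothetical helper name, and it relies on git cat-file --batch-check accepting a format string. One possibility, which the source does not confirm, is that other branches or tags still referenced the doc directory, since the filter-branch command in the question was given only HEAD and therefore rewrote only commits reachable from it.

```shell
#!/bin/sh
# Sketch: print the N largest blobs reachable from any ref, as "<bytes> <path>".
# largest_blobs is a hypothetical helper name used only for illustration.
largest_blobs() {
    count="${1:-10}"
    # List every object reachable from all refs, together with its path.
    git rev-list --objects --all |
        # Ask git for each object's type and size; %(rest) carries the path through.
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        # Keep only blobs and drop the leading type column.
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        # Largest first.
        sort -rn |
        head -n "$count"
}
```

If paths under doc still show up after the rewrite, some ref is still holding on to the old history.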

First of all, check the size of your local repository using the following command:

git count-objects -Hv

Then use the following commands to expire old reflog entries, prune unreachable objects, and repack:

git reflog expire --expire="1 hour" --all
git reflog expire --expire-unreachable="1 hour" --all
git prune --expire="1 hour" -v
git gc --aggressive --prune="1 hour"

Now run git count-objects -Hv again to see the change in the repository's size and garbage.
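
The before-and-after comparison can be scripted. The sketch below is only an illustration: repo_stat and shrink_and_report are hypothetical helper names, and it uses --expire=now and --prune=now (as in the question) rather than the one-hour window shown above, so it discards all reflog entries and unreachable objects immediately. It reads the plain -v output, whose sizes are in KiB, so the numbers can be compared directly.

```shell
#!/bin/sh
# Sketch: report pack size before and after cleanup, plus remaining garbage.
# repo_stat and shrink_and_report are hypothetical helper names.
repo_stat() {
    # $1 is a field name from `git count-objects -v`, e.g. size-pack
    git count-objects -v | awk -v key="$1:" '$1 == key { print $2 }'
}

shrink_and_report() {
    before=$(repo_stat size-pack)
    # Drop all reflog entries so nothing keeps old objects alive.
    git reflog expire --expire=now --all
    # Repack and prune unreachable objects immediately.
    git gc --quiet --prune=now
    after=$(repo_stat size-pack)
    echo "size-pack: ${before} KiB -> ${after} KiB"
    echo "size-garbage: $(repo_stat size-garbage) KiB"
}
```

Run shrink_and_report only on a repository you have a backup of, since the pruned objects cannot be recovered afterwards.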

How on Earth did the remote repo size jump from 973MB to 2.3GB?

This is a known bug on the Bitbucket Cloud side; see BCLOUD-19794:

"Garbage file is intermittently counted in the repository size."

When pushing to the remote repository, a GC is triggered afterwards, which generates a garbage file. This garbage file is cleared on the next GC. Between those two GCs, the size of the repository is displayed incorrectly in the Bitbucket UI, because the garbage file size is intermittently counted towards the repository's total size.

As noted in the workaround section of that issue, you need to contact Bitbucket to have the GC run manually.

Bitbucket might take action sooner rather than later if enough people vote for the issue.

As I am sure those familiar with git already know, git stores the version history of your files, so making changes and pushing files will not reduce your repo size.

There are still several ways to reduce repo sizes on Bitbucket, GitHub, GitLab, etc. The best way is to delete branches, since that removes any files recorded only by that branch, as long as they are not also referenced by another branch. But you may want the latest files in that branch, so do the following:

  1. On your local machine, create a duplicate of the repo. (This is a backup, so you don't lose information.)
  2. Delete the branch that you want to move or create a fresh version of. To delete the branch on the remote as well, use git push <remote> --delete <branch>.
  3. If you want to refresh the branch, you can copy the files into a new branch and push it.
  4. If you want to create a new remote repo, you can do that too.

Depending on the host, you may have to run special commands, but this should work in most cases.
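
The backup and "fresh version" steps above can be sketched roughly as follows. This is one possible reading of those steps, not the answer's exact procedure: backup_repo and refresh_branch are hypothetical helper names, and the refresh uses an orphan branch so that the new branch carries the latest files but none of the old history.

```shell
#!/bin/sh
# Sketch: back up a repo, then replace a branch with a single-commit snapshot.
# backup_repo and refresh_branch are hypothetical helper names.
backup_repo() {
    # $1 = source repo path or URL, $2 = directory for the mirror backup
    git clone --quiet --mirror "$1" "$2"
}

refresh_branch() {
    # $1 = branch to replace with a history-free copy of its latest files
    branch="$1"
    git checkout --quiet "$branch" &&
    # Start a new branch with no parents; the files stay staged in the index.
    git checkout --quiet --orphan "${branch}-fresh" &&
    git commit --quiet -m "Fresh snapshot of $branch" &&
    # Drop the old branch and give the snapshot its name.
    git branch -D "$branch" >/dev/null &&
    git branch -m "$branch"
}
```

After refresh_branch, the rewritten branch would have to be force-pushed (for example git push --force origin <branch>), and the old objects only disappear once garbage collection has run locally and on the host.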
