
Local cache for a github repository?

We use github to manage a great deal of our software environment, and I would wager that, like many other orgs, the overwhelming majority of traffic to and from that repo comes from our office. With that in mind, is there a way to build a local cache of a given github repository while still having the protection of the cloud version? I'm thinking of this on the model of a caching proxy server, where the local server (presumably in our building, on our local network) would handle the vast majority of clone/pull operations.

This seems like it should be doable, but searching for it has been very difficult, I think in no small part because the words "local" and "cache" have overloaded meanings, especially in git(hub) questions.

Your latest comment makes it clear you're looking for a performance optimization. That helps.

You can start by creating a local mirror of the github repository following these instructions. You can either update it periodically, or arrange to receive webhooks from github so the local mirror is updated "on demand". For the latter you would need to set up a small web service that responds to the hooks from github. You can add a webhook by going to https://github.com/someuser/someproject/settings/hooks/new. You will probably want to select the "Let me select individual events" radio button, and then select:

  • delete
  • push
  • create

This would keep your cache up-to-date with respect to changes in available tags and branches.
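The mirror step above could be sketched roughly as follows. This is only a sketch: `sync_mirror` is a hypothetical helper name, and the paths in the usage note are placeholders. It relies on `git clone --mirror` for the initial copy and `git remote update --prune` for refreshes, so deleted branches and tags disappear from the mirror as well.

```shell
# Create the bare mirror on first use, refresh it on every later call.
# $1 = upstream URL (e.g. the github repository), $2 = local mirror path.
sync_mirror() {
    upstream=$1
    mirror=$2
    if [ -d "$mirror" ]; then
        # Fetch new objects and drop refs that were deleted upstream.
        git -C "$mirror" remote update --prune
    else
        # --mirror copies every ref into a bare repository.
        git clone --mirror "$upstream" "$mirror"
    fi
}
```

You might call it as `sync_mirror https://github.com/someuser/someproject.git /srv/git/someproject.git`, either from cron or from whatever handles the webhook.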

Set up a git server that makes that repository available locally. This can be as simple as running git daemon, or a local account accessible via ssh, or something more full-featured, depending on your local requirements.
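For the git daemon option, a minimal sketch might look like this. The function name `serve_mirrors` and the `/srv/git` directory are assumptions for illustration; the flags themselves are standard `git daemon` options.

```shell
# Serve every repository under a base directory, read-only, over git://.
# $1 = directory holding the mirrors (hypothetical default /srv/git)
# $2 = TCP port (9418 is git daemon's default)
serve_mirrors() {
    base=${1:-/srv/git}
    port=${2:-9418}
    # --export-all serves repositories without a git-daemon-export-ok file;
    # --base-path maps git://host/someproject.git to $base/someproject.git.
    git daemon --reuseaddr --export-all --base-path="$base" --port="$port" "$base"
}
```

Note that repositories served this way are cloned with a `git://` URL (for example `git://localrepository/someproject.git`); an `http://` URL needs an HTTP server in front of the repository instead.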

Then you would set up your local working copies like this:

$ git clone http://localrepository/someproject.git
$ cd someproject
$ git remote set-url --push origin http://github.com/someuser/someproject.git

This would set up each working copy to pull from your local cache, but push changes upstream to github.
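If you do this for many working copies, the two steps can be wrapped up and checked, roughly like so. The helper names `clone_via_cache` and `show_remote_urls` are made up for this sketch; `git remote get-url` needs git 2.7 or later.

```shell
# Clone from the local cache, then point pushes at the upstream repository.
# $1 = cache URL, $2 = upstream (github) URL, $3 = destination directory.
clone_via_cache() {
    cache_url=$1
    upstream_url=$2
    dest=$3
    git clone "$cache_url" "$dest" &&
        git -C "$dest" remote set-url --push origin "$upstream_url"
}

# Print the fetch URL and the push URL of origin, one per line.
show_remote_urls() {
    git -C "$1" remote get-url origin
    git -C "$1" remote get-url --push origin
}
```

After `clone_via_cache`, `show_remote_urls someproject` should list the cache first and github second; `git remote -v` shows the same information.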

You should check out the git-cache-http-server project. I think it partly implements what you need (and is similar to the idea in @larsks's post).

It is a NodeJS piece of software that runs an HTTP server to give you access to locally cached git repositories. The server automatically fetches upstream changes when required. If you use those local git repositories instead of the distant ones, your git client will be served locally cached content.

If you run git-cache-http-server on a separate host (a VM or container, for example), you can configure your local git client to automatically clone and fetch from the cache by having it replace https://github.com with something like http://gitcache/github.com. This can be achieved with a configuration like:

git config --global url."http://gitcache:1234/".insteadOf https://
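That rule rewrites every https:// URL. If you only want github traffic to go through the cache, a narrower rule might look like the sketch below. It is applied per repository here rather than with --global so it is easy to undo; `route_github_via_cache` and `effective_url` are hypothetical helper names, and the cache base URL is whatever address your cache host uses (http://gitcache:1234/ in the example above).

```shell
# Rewrite only https://github.com/ URLs for one repository.
# $1 = working copy, $2 = cache base URL ending in a slash.
route_github_via_cache() {
    repo=$1
    cache_base=$2
    git -C "$repo" config "url.${cache_base}github.com/.insteadOf" "https://github.com/"
}

# Print the URL git would actually contact, without touching the network.
effective_url() {
    git -C "$1" ls-remote --get-url "$2"
}
```

With the rule in place, `effective_url . https://github.com/someuser/someproject.git` should print the rewritten cache URL, while URLs on other hosts come back unchanged.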

At the moment, this software only provides a cache for cloning and updating a repository; there is no provision for pushing changes back. For some use cases this can still be useful, such as a CI infrastructure that needs to pull the content of multiple repositories even when only one of them has changed, or the automated testing you mention.

Take a look at git clone --reference-if-able to obtain objects from another (in your case, on-site) repository.
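A rough sketch of that approach, assuming an on-site mirror already exists (`clone_with_reference` is a made-up helper name; `--reference-if-able` needs git 2.11 or later):

```shell
# Clone from upstream, borrowing objects from a local reference repository
# when it exists; if it does not, git warns and clones normally.
# $1 = on-site reference repository, $2 = upstream URL, $3 = destination.
clone_with_reference() {
    reference=$1
    upstream=$2
    dest=$3
    git clone --reference-if-able "$reference" "$upstream" "$dest"
}
```

The resulting clone keeps depending on the reference repository for the borrowed objects. Adding `--dissociate` to the clone copies them over instead, so the reference can later be moved or deleted safely.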
