
Public and private code in a single Git repository

A research group I'm participating in currently hosts all of its code in a private SVN repository. We'd like to open up our code and move most of it onto GitHub. The problem is that some of the code is sensitive and should not be opened up, but we still want it under version control. At the moment, we have the open code on GitHub and the private code still in the private SVN repository. Is there a good way to do this in a single Git repository?

With a single Git repository, no. What you can do is use git submodules, which let you "combine" repositories. Keep your public code on GitHub, and create a second, privately hosted Git repository for your private code that references the public repository as a submodule. Changes made inside the public submodule can be pushed to GitHub, and changes on GitHub can be pulled back down, but changes outside the submodule are never exposed to the public community.

Although the code trees are merged under a single root, you have to manage commits, pushes, and pulls separately for each module. Many people find this cumbersome and error-prone, so experiment with the workflow before rolling it out widely.
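To make the "manage them separately" part concrete, here is a minimal sketch of that setup run entirely in a temporary directory. The repository names (public-code, private-code), file names, and the author identity are made up for illustration, and a local directory stands in for the GitHub repository. The protocol.file.allow override is only there because newer Git versions refuse local-path submodules by default; it should not be needed with a real GitHub URL.

```shell
set -eu
# Throwaway identity so the commits work in a clean environment (placeholder values).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
cd "$work"

# 1. Local stand-in for the public repository that would live on GitHub.
git init -q public-code
cd public-code
echo "open algorithm" > algorithm.txt
git add algorithm.txt
git commit -q -m "Add public code"
cd ..

# 2. The privately hosted repository with the sensitive code.
git init -q private-code
cd private-code
echo "sensitive data" > secret.txt
git add secret.txt
git commit -q -m "Add private code"

# 3. Embed the public repository as a submodule of the private one.
git -c protocol.file.allow=always submodule --quiet add "$work/public-code" public-code
git commit -q -m "Add public code as a submodule"

# 4. A change inside the submodule is committed in the submodule's own history.
#    Running "git push" from this directory is what would publish it.
cd public-code
echo "improvement" >> algorithm.txt
git commit -q -am "Improve algorithm"
cd ..

# 5. The private repository only records which submodule commit it points at.
git add public-code
git commit -q -m "Update public-code submodule"
```

Note that steps 4 and 5 are two separate commits in two separate histories. That is the overhead mentioned above: forgetting either one leaves the two repositories out of step.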

The Git Submodules documentation has moved here: http://git-scm.com/book/en/Git-Tools-Submodules

The idea is that the submodule is a public Git repository embedded inside a private Git repository. You manage them separately, with the advantage that the private repository has the public repository's files embedded in its directory structure.

For example:

 /private-repository
     /some (private) directory
     /public-repository
         /some (public) directory
     /some other (private) directory

Another option is git-crypt: https://www.agwa.name/projects/git-crypt/
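As a rough illustration of how git-crypt is pointed at the sensitive files: it is driven by filter entries in .gitattributes. The sketch below only writes such an entry and asks Git which filter would apply to each file, so it runs without git-crypt installed. The private/ directory and file names are hypothetical. Actual encryption additionally needs git-crypt installed and `git-crypt init` run in the repository before the files are committed. Keep in mind that git-crypt encrypts file contents only; file names and commit messages stay readable to anyone who can clone the repository.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Route everything under private/ through the git-crypt filter.
printf '%s\n' 'private/** filter=git-crypt diff=git-crypt' > .gitattributes

mkdir private
echo "sensitive" > private/notes.txt
echo "open" > README.txt

# Ask Git which filter applies to each path (no git-crypt needed for this check).
git check-attr filter -- private/notes.txt README.txt

# With git-crypt installed, the remaining steps would be roughly:
#   git-crypt init                    # generate the repository key
#   git-crypt add-gpg-user USER_ID    # grant a collaborator access (USER_ID is a placeholder)
#   git-crypt unlock                  # run by a collaborator after cloning
```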

Yep, git submodules seem to solve the problem for us as well. We used to develop our open source CMS and its premium (paid) extensions in the same branch of our private repository. Now we have decided to move core development to a public GitHub repo and split the development.

Here is how we do it.

  1. We created a public repository with two branches: master for releases and develop for actual development.
  2. We created a new repository for each extension we have.
  3. We have a private repository containing the core files, where we carry on our development, with all the premium extensions added as submodules. It's private, of course. Its branch is named dev.

To merge changes, we simply cherry-pick from our dev branch to develop, and that's it. This gives us a clean, core-related history in the public repository, and it's easy to clone just one repository and then recursively update the submodules. Keeping things in sync takes a bit more time, but it's well worth it.
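The cherry-pick step might look something like the sketch below. For simplicity it keeps both branches in one temporary repository, whereas in the setup described dev lives in the private repository and develop in the public one, so the public repository would be reached through a remote. File names, commit messages, and the author identity are invented for the example.

```shell
set -eu
# Throwaway identity so the commits work in a clean environment (placeholder values).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "core v1" > core.txt
git add core.txt
git commit -q -m "Initial core"
git branch -M develop              # public development branch

# Private branch: one core fix that can be published, one premium-only commit.
git checkout -q -b dev
echo "core fix" >> core.txt
git commit -q -am "Fix core bug"
core_fix=$(git rev-parse HEAD)     # remember the commit that may go public
echo "premium" > premium.txt
git add premium.txt
git commit -q -m "Premium-only work"

# Bring only the core fix over to the public branch.
git checkout -q develop
git cherry-pick "$core_fix" > /dev/null
```

After this, develop contains the core fix but not premium.txt. For the "recursively update submodules" part, `git clone --recursive <url>` or `git submodule update --init --recursive` in an existing clone does the job.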

Cheers

Here is a workflow I have used for the initial phase of a project.

  1. A private-master branch.
  2. A confidential-removed branch on top of private-master, with the private content removed. Each item has to be removed only once, not for every release.
  3. A public-master branch based on a very early commit that never contained any private content.
  4. A version-branch on top of public-master.
  5. The version-branch is prepared by cherry-picking or squash-merging new commits from confidential-removed.
  6. The merge request from version-branch to public-master is carefully reviewed to make sure every contribution is OK to publish. The diff is human-readable.
  7. public-master is also pushed to a public remote or fork.

Alternatively, public feature branches can be used instead of version branches; they go through review first in-house and then in the public repository.
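The numbered steps above can be sketched in a throwaway repository like this. The branch names follow the list; the file names, commit messages, and author identity are made up, and the squash-merge variant of step 5 is used. Treat it as an illustration of the branch layout rather than a complete release procedure.

```shell
set -eu
# Throwaway identity so the commits work in a clean environment (placeholder values).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

repo=$(mktemp -d)
cd "$repo"
git init -q .
echo "shared" > lib.txt
git add lib.txt
git commit -q -m "Early commit without private content"
git branch -M private-master       # step 1
git branch public-master           # step 3: starts at the clean early commit

# Ongoing private work mixes shareable and sensitive files.
echo "password" > secret.txt
echo "feature" > feature.txt
git add secret.txt feature.txt
git commit -q -m "Add feature and private settings"

# Step 2: remove the private content once, on its own branch.
git checkout -q -b confidential-removed
git rm -q secret.txt
git commit -q -m "Remove private content"

# Steps 4 and 5: squash the cleaned-up state onto a version branch.
git checkout -q -b version-branch public-master
git merge --squash confidential-removed > /dev/null
git commit -q -m "Version 1.0"

# Step 6: this diff is what a reviewer would inspect before publishing.
git diff --stat public-master version-branch

# Step 7 (local part): once approved, fast-forward public-master.
git checkout -q public-master
git merge -q --ff-only version-branch
```

Because the squash commit records only the net difference, secret.txt never enters the history reachable from public-master. When pushing, only public-master should go to the public remote, for example `git push public public-master`, where `public` is a hypothetical remote name.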

Features of this workflow:

  • there can be different versions of the same file in public and private
  • no need to split code into different folders by publicity level
  • no need to split code across multiple repositories by publicity level
  • contributions from developers of the public branch can be merged into private, though some conflicts might arise

Here are the downsides of this approach:

  • it requires extra work from a dedicated person
  • it is somewhat error-prone
  • cherry-picking may reveal hidden dependencies (i.e. the picked code may use some non-picked code)

If the project gets large, the number of participants increases, or confidentiality becomes more critical, you should really split the project in two and use the public part as a shared backend, library, or plugin for the private part.

Nope.

Unless you want to write git hooks to encrypt and decrypt the source code, you'll have to live with two repos. When someone clones a Git repo, they literally get a full copy of it, so it is not possible to make parts of it private without encryption.

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address.
