
Fork a file within a Git repository

I'm working on an R project which currently has the following directory layout:

proj1
  |-- file.r

file.r is used to build a statistical model specific to Project 1 (hence proj1).

During the course of development, we will be building numerous models for numerous projects:

Work
  |-- proj1
  |     └-- file.r
  |-- proj2
  |     └-- file.r
  :
  └-- projn
        └-- file.r

file.r will be about 90% identical across the projects, but there will be differences. My question is: is there a way to create a master file.r and simply fork it for each project? That way, a bugfix or enhancement to the master could simply be rebased down to the forks, with the project-specific changes applied on top. My first thought was to use submodules, but I'm not certain how to apply that here. Thanks!

Use a "topic branch" for each project:

git checkout master
git add file.r   # this is your master template upon which others are based
git commit -m "Committed the master file"

Then for each project:

git checkout -B <project> master   # create <project> from master and check it out (-B resets the branch if it already exists)
# hack away on file.r, commit when you want
git push origin <project>          # share <project> with others

So in practice you end up with master, upon which, say, project1, project2, project3 and so forth are based. This should do exactly what you want and keep it all quite sane.
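The steps above do not show how a fix on master reaches the project branches, which is the "rebase down to the forks" part of the question. Here is a minimal sketch of one way to do it, run in a throwaway repository. The branch names proj1 and proj2, the file contents, and the demo identity are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: replay per-project branches on top of a fixed master.
set -e

repo=$(mktemp -d)                          # throwaway repository for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master" whatever the local default is
git config user.name "Demo"                # placeholder identity (assumption)
git config user.email "demo@example.com"

# master holds the shared template
printf 'model <- lm(y ~ x, data = d)\n\n# diagnostics\nsummary(model)\n\n# project-specific settings below\n' > file.r
git add file.r
git commit -q -m "Committed the master file"

# each project branch carries its own change on top of master
for p in proj1 proj2; do
    git checkout -q -b "$p" master
    printf '# %s-specific setting\n' "$p" >> file.r
    git commit -q -am "$p: project-specific change"
done

# a bugfix lands on master...
git checkout -q master
sed 's/y ~ x/y ~ x + z/' file.r > file.r.tmp && mv file.r.tmp file.r
git commit -q -am "Fix shared model formula"

# ...and every project branch is replayed on top of it
for p in proj1 proj2; do
    git rebase -q master "$p"
done

git show proj1:file.r                      # shared fix plus the proj1-only line
```

Rebasing rewrites the history of each project branch, so a branch that has already been pushed would need a forced push afterwards (for example git push --force-with-lease origin <project>). If the branches are shared with others, running git merge master on each project branch brings in the same fix without rewriting history.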

Advantages of this solution over others that encourage multiple repositories:

  1. Easier to manage. You've only got one repository that in practice has, what, 20-30 branches at most? Sounds like a lot, but with clear labels it's simple to know where you are, particularly if you're only managing a small file set.
  2. Easy diffs if you're lazy (as I am). You can see the differences between two projects' file.r with git diff projectA projectB -- file.r. You could do the same with multiple repositories, but it requires a repository specification like git diff projectA/master projectB/master -- file.r. That could get confusing if you have 20-30 project repositories or use submodules.
  3. Easy updates. Grabbing updates is as simple as issuing git fetch origin and watching the output.
  4. Easy clones. When setting up a new local repo, you clone a single remote. No need to clone origin and then git remote add <project> repositories until you've got them all.
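To make the "easy diffs" point concrete, here is a small sketch that builds two project branches in a throwaway repository and compares their file.r. The file contents and demo identity are invented for illustration; the branch names projectA and projectB follow the example in the list.

```shell
#!/bin/sh
# Sketch: compare file.r between two project branches of one repository.
set -e

repo=$(mktemp -d)                          # throwaway repository for the demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the initial branch "master"
git config user.name "Demo"                # placeholder identity (assumption)
git config user.email "demo@example.com"

printf 'model <- lm(y ~ x, data = d)\nsummary(model)\n' > file.r
git add file.r
git commit -q -m "Committed the master file"

for p in projectA projectB; do
    git checkout -q -b "$p" master
    printf 'label <- "%s"\n' "$p" >> file.r
    git commit -q -am "$p: project-specific change"
done

git branch --list                                   # master plus one branch per project
git --no-pager diff projectA projectB -- file.r     # only the project-specific lines differ
```

Because both branches live in the same repository, no remote names or extra clones are involved. The diff compares the committed state of the two branches and does not depend on which one is checked out.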

Disadvantages (an incomplete list):

  1. This method relies on you paying close attention to which branch you have checked out. Nothing about the directory structure will clue you in, so it might not be obvious which file.r you're viewing at any given moment. That might be a deal breaker. I dunno. I suppose it depends on your workflow.
  2. As KurzedMetal points out in comments, this could get messy fast if you ever need to merge all the projects into one. As such, I wouldn't recommend it for source code. For distinct R projects, however, this might be less of a concern.
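One possible way to soften the first disadvantage is to have scripts check the current branch before touching file.r. The helper below, require_branch, is a hypothetical name and not part of Git. It is a sketch built on git symbolic-ref --short HEAD, which prints the name of the checked-out branch.

```shell
#!/bin/sh
# Sketch: refuse to continue unless the expected project branch is checked out.
set -e

# Hypothetical helper: succeeds only when the current branch matches $1.
# (git symbolic-ref fails on a detached HEAD, which also stops the script.)
require_branch() {
    expected=$1
    current=$(git symbolic-ref --short HEAD)
    if [ "$current" != "$expected" ]; then
        echo "on branch '$current', expected '$expected'" >&2
        return 1
    fi
}

# Demonstration in a throwaway repository whose current branch is proj1
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/proj1

require_branch proj1 && echo "ok: this is proj1's file.r"
if ! require_branch proj2 2>/dev/null; then
    echo "refused: proj2 is not checked out"
fi
```

Showing the branch name in the shell prompt serves the same purpose interactively.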

IMO the best way to achieve this is:

  1. create a library from your shared code and put it in its own repo
  2. create a repo for each project
  3. use git submodule to integrate the shared code into each project
  4. import the R library and add the project-specific code.
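The four steps can be sketched roughly as follows, using local throwaway repositories. The names shared, model.R and fit_model are assumptions, not something the answer specifies. The -c protocol.file.allow=always setting is only there because this demo adds the submodule from a local path, which recent Git versions block by default. A real setup would normally use a remote URL instead.

```shell
#!/bin/sh
# Sketch: shared code in its own repo, pulled into a project as a submodule.
set -e

tmp=$(mktemp -d)

# 1. a repo holding the shared code
git init -q "$tmp/shared"
cd "$tmp/shared"
git config user.name "Demo"                # placeholder identity (assumption)
git config user.email "demo@example.com"
printf 'fit_model <- function(d) lm(y ~ x, data = d)\n' > model.R
git add model.R
git commit -q -m "Shared model code"

# 2. a repo for one project
git init -q "$tmp/proj1"
cd "$tmp/proj1"
git config user.name "Demo"
git config user.email "demo@example.com"

# 3. integrate the shared code as a submodule in the directory "shared"
git -c protocol.file.allow=always submodule --quiet add "$tmp/shared" shared

# 4. project-specific code that loads the shared code
printf 'source("shared/model.R")\n# project 1 specifics go here\n' > file.r
git add file.r
git commit -q -m "proj1: use shared model code via submodule"

git submodule status                       # shows the pinned commit of shared
```

Each project pins a specific commit of the shared repo, so a fix to the shared code reaches a project only when its submodule is updated and the new pointer is committed.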

There are several ways, some of them described in the other answers.

For example:

  • use object-oriented patterns or templates to increase reuse and reduce code.
  • use git branch
  • use git submodule
  • Finally, when there is no other way, I add a comment to the file header noting that the file is a fork of another file:

      Date       | Author | Description
      ---------- | ------ | ------------------------
      05/18/2018 | You    | Forked from Other/file.r
