
Version control for reports (git)

Original post: 2010-08-06 22:56:48 7 3 git/ version-control/ sas

I have a particular report that I am asked to run from time to time. The details are slightly different each time - different date ranges, different selection criteria - but structurally, the report is fairly stable. I do make some structural changes from time to time, however.

I have two hopes for these reports:

1) to be able to reproduce any report at a later date.
2) to be able to review the structural changes made to the report over time.

Right now, I just have a folder with a master script, which I modify for every iteration of the report, and subfolders where I save a snapshot of the master script and the data for each run.

Maybe that's good enough. But I've started using git to manage my (much more complex) data analysis scripts, and I was wondering if there was a way to use it here (and for myriad similar reports) that would allow for more robust version control.

I can think of a few different ways to do so: make a branch for each report, but only merge structural changes back onto the master; clone the master into the subfolder for a new report, make changes there, push back structural changes; etc. But I really don't even know enough to be able to separate insane ideas from plausible ones, much less good ones. Let me know what you think. Thanks.

3 answers

It obviously depends on the report and how it would change, but from what you describe, it seems to me you could write a good and meaningful SAS macro program that takes all your selection criteria as parameters. In the SAS macro code you can then evaluate the parameters and make the structural change, if necessary.

So you would have one .sas file with just one big macro in it; depending on the parameters you use to call the macro, it can reproduce all the reports you want.

Does this make sense to you? If it doesn't, let me know, and I could provide some examples of SAS macros to get you started if you are not familiar with them.

I'd personally go for your first suggestion:

make a branch for each report, but only merge structural changes back onto the master

This is by far the easiest conceptually, and by merging the structural changes into the head revision, you can apply them to the other branches as and when required. The only downside is the number of branches you'll leave lying around, but it sounds like an infrequent request, and a good naming scheme should sort that out.
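As a rough sketch of that branch-per-report workflow: the example below assumes the structural change is committed separately from the run-specific edits, and it uses `git cherry-pick` (rather than a full merge) so that only that one commit reaches master. The file name, branch name, file contents, and commit messages are all made up for illustration.

```shell
#!/bin/sh
# Sketch only: file, branch, and message names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # call the initial branch "master"
git config user.name "Report Author"       # local identity for this throwaway repo
git config user.email "author@example.com"

# master holds the structurally current script with placeholder parameters.
printf '%s\n' '# parameters' 'start=DEFAULT' 'end=DEFAULT' \
    '#' '# structure' '#' 'layout=v1' > report_script.txt
git add report_script.txt
git commit -q -m "Master report script"

# One branch per run of the report.
git checkout -q -b report/2010-08
sed 's/start=DEFAULT/start=2010-08-01/; s/end=DEFAULT/end=2010-08-31/' \
    report_script.txt > report_script.tmp
mv report_script.tmp report_script.txt
git commit -q -am "Run-specific: August 2010 date range"

# A structural change, committed on its own so it can be picked out later.
sed 's/layout=v1/layout=v2/' report_script.txt > report_script.tmp
mv report_script.tmp report_script.txt
git commit -q -am "Structural: switch to layout v2"
structural=$(git rev-parse HEAD)

# Bring only the structural commit back to master.
git checkout -q master
git cherry-pick "$structural" >/dev/null

cat report_script.txt    # master now has layout=v2 but still the DEFAULT dates
```

The report branch keeps the exact script used for that run, so `git show report/2010-08:report_script.txt` reproduces it later, while `git log master` shows only the structural history.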

I have a particular report that I am asked to run from time to time. The details are slightly different each time - different date ranges, different selection criteria - but structurally, the report is fairly stable.

If you can anticipate which fields change each time, I would say make a generic report that prompts you for this data each time the report is run. You should be able to do this in just about any reporting software. The report itself can be tracked in git, and you won't have to worry about having 50,000 branches in your repository.

If it's unpredictable which fields need to be customized each time, give most of the fields useful default values.
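To illustrate the "parameters with defaults" idea, here is a minimal sketch of a wrapper that could sit in the repository next to the report. The parameter names (`REPORT_START`, `REPORT_END`, `REPORT_REGION`) and their default values are invented for the example, and how the values are handed to the reporting tool is left out because it depends on the software in use.

```shell
#!/bin/sh
# Sketch only: parameter names and defaults are hypothetical.
set -e

# Each parameter falls back to a default when it is unset or empty.
report_params() {
    start=${REPORT_START:-2010-01-01}
    end=${REPORT_END:-2010-12-31}
    region=${REPORT_REGION:-ALL}
    printf 'start=%s end=%s region=%s\n' "$start" "$end" "$region"
}

# A run that accepts every default.
defaults=$(report_params)

# A run that overrides only the date range; region keeps its default.
custom=$(REPORT_START=2010-05-01 REPORT_END=2010-05-31 report_params)

echo "$defaults"
echo "$custom"
```

With this shape, only the wrapper and the report definition need to be tracked in git, and the values used for a given run can be recorded alongside its output.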

If you run this report a lot, and are specifically interested in keeping track of the various result sets, I'd suggest a different approach. I don't know what your report generates, but let's say it's a PDF. I would make a directory structure somewhere, and you could store each run in results/year/month/date.pdf. This way you will have a record of the data pulled on May 5, 2010 (or with May 5, 2010 as a parameter).
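The results/year/month/date.pdf layout could be scripted along these lines. This is only a sketch: the helper name `archive_path`, the `YYYY-MM-DD` input format, and the empty `report.pdf` stand-in are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: archive_path and report.pdf are hypothetical names.
set -e

# Build the archive path for a run date given as YYYY-MM-DD.
archive_path() {
    run_date=$1
    year=${run_date%%-*}     # text before the first dash
    rest=${run_date#*-}      # text after the first dash (MM-DD)
    month=${rest%%-*}
    day=${rest#*-}
    printf 'results/%s/%s/%s.pdf\n' "$year" "$month" "$day"
}

workdir=$(mktemp -d)
cd "$workdir"
: > report.pdf               # empty stand-in for the generated report

dest=$(archive_path 2010-05-05)
mkdir -p "$(dirname "$dest")"
cp report.pdf "$dest"

echo "$dest"                 # prints results/2010/05/05.pdf
```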

Edit: You might consider tags instead of branches for those things you can't combine into a single report. If you have a version you think you're going to need quick access to, tag it. Any time you need to get back to it, just check out the tag and run the report.
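A minimal sketch of the tag approach follows. The tag name `report-2010-05-05` and the file contents are placeholders chosen for the example.

```shell
#!/bin/sh
# Sketch only: tag name and file contents are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Report Author"       # local identity for this throwaway repo
git config user.email "author@example.com"

# The version of the script used for a particular run.
echo "layout=v1" > report_script.txt
git add report_script.txt
git commit -q -m "Report as run on 2010-05-05"
git tag report-2010-05-05                  # mark this version for quick access

# The script keeps evolving afterwards.
echo "layout=v2" > report_script.txt
git commit -q -am "Later structural change"

# Any time later: check out the tag and rerun the report from it.
git checkout -q report-2010-05-05
tagged_version=$(cat report_script.txt)
echo "$tagged_version"                     # prints layout=v1
```

Checking out a tag leaves the repository in a detached HEAD state, which is fine for rerunning a report; switch back to a branch before making new commits.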