
Remove all but selected files in all Git commits

Among all the files in my Git history, there are several "interesting" ones (those I have touched). I want to publish only these "interesting" files, together with their history, as a Git repo, without any other files being present anywhere in the history of that repo.

How can I write a smart script for git filter-branch --index-filter? (Or at least for git filter-branch --tree-filter, which is undesirable, however, since it is slower and my saved trees are huge.)

Note that my question is a bit different from the most common similar one people are asking [1][2]: how to remove a specific ("sensitive") file from the Git history? I need to remove the complement and keep the specific files.

So, the tricky part in this script for git filter-branch --index-filter is to get the list of files from the index, filter out the specific ones to keep, and then remove everything that remains in the list.

I have implemented this as a separate executable script, git-update-index-keeping-only; here is the rough implementation:

# The list of files to keep arrives on stdin, one path per line.
FILELIST="$(cat)"
git ls-files --full-name \
| grep -F -v -x -f <(echo "$FILELIST") \
| xargs git rm --cached "$@" --

where I haven't thought much about what would happen to newlines and spaces in the filenames (spaces must be a problem for xargs, unless it is told to invoke the command separately for each argument, which I avoided for efficiency).
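The whitespace concern can probably be addressed with NUL-delimited records while still batching arguments. What follows is only a sketch, not a tested replacement for the script above. It assumes GNU grep (for -z) and an xargs that supports -0 and -r, and it assumes the keep list itself still arrives on stdin one path per line, so kept paths containing newlines are not covered. The function names paths_to_remove and keep_only are hypothetical.

```shell
#!/bin/bash
# Sketch only: assumes GNU grep (-z) and an xargs supporting -0 and -r.

# paths_to_remove KEEPFILE
# Reads NUL-delimited paths on stdin and prints, NUL-delimited, every path
# that does not exactly match a line of KEEPFILE (newline-separated list).
paths_to_remove() {
    grep -z -F -v -x -f "$1"
}

# keep_only [git-rm options]
# Reads the keep list from stdin, then removes every other path from the
# index. Must run inside a Git work tree (e.g. from --index-filter).
keep_only() {
    keep_list="$(mktemp)" || return 1
    cat > "$keep_list"
    git ls-files -z --full-name \
        | paths_to_remove "$keep_list" \
        | xargs -0 -r git rm --cached "$@" --
    status=$?   # exit status of xargs; -r makes an empty removal list a no-op
    rm -f "$keep_list"
    return "$status"
}

# When saving this as git-update-index-keeping-only, end the script with:
# keep_only "$@"
```

With -r, xargs does not run git rm at all when nothing needs removing, which matters because a failing index filter aborts git filter-branch.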

A sample usage is written down in another script that is useful for my use case: it takes the list of interesting files to be those modified or added in the diff between 2 commits (say, an "upstream" commit and your last commit on top of that).

It is called git-filter-only-files-modified-since; its essence is like this:

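# Interesting files: those modified, added, copied, renamed or type-changed
# between the tree of $SINCE and the tree of HEAD.
FILES="$(git diff-tree "$SINCE": HEAD: \
   -r --name-only --diff-filter=MACRT)"
export FILES
# The index filter is evaluated once for every commit being rewritten.
git filter-branch \
   --index-filter 'echo "$FILES" | git-update-index-keeping-only -q'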
