
Understanding why git-filter-branch is not cleaning my history

I used gitleaks to check for leaked secrets in my repo's history. When I ran the following command and force-pushed:

git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch terra/fixtures.go' \
  --prune-empty --tag-name-filter cat -- --all

it seemed to work, except I noticed the following:

WARNING: Ref 'refs/heads/automate_tests' is unchanged
WARNING: Ref 'refs/heads/ethRawTransaction' is unchanged
WARNING: Ref 'refs/heads/feature/177/leave-bastion' is unchanged
WARNING: Ref 'refs/heads/feature/FAQ' is unchanged
WARNING: Ref 'refs/heads/master' is unchanged
WARNING: Ref 'refs/heads/mjolnir' is unchanged
WARNING: Ref 'refs/heads/tmp' is unchanged
WARNING: Ref 'refs/remotes/origin/master' is unchanged
WARNING: Ref 'refs/remotes/origin/automate_tests' is unchanged
WARNING: Ref 'refs/remotes/origin/bug/0.0.11-beta-fix' is unchanged
WARNING: Ref 'refs/remotes/origin/bug/bastion-ssh' is unchanged
WARNING: Ref 'refs/remotes/origin/bug/fix-examples-merge' is unchanged
WARNING: Ref 'refs/remotes/origin/develop' is unchanged
WARNING: Ref 'refs/remotes/origin/ethRawTransaction' is unchanged
WARNING: Ref 'refs/remotes/origin/feature/168/auto-ssh-to-bastion' is unchanged
WARNING: Ref 'refs/remotes/origin/feature/169/ethstats_for_pantheon' is unchanged
WARNING: Ref 'refs/remotes/origin/feature/175/ssh-to-certain-nodes' is unchanged
WARNING: Ref 'refs/remotes/origin/feature/176/tagging-nodes-to-ips' is unchanged
WARNING: Ref 'refs/remotes/origin/feature/177/leave-bastion' is unchanged
WARNING: Ref 'refs/remotes/origin/feature/FAQ' is unchanged
WARNING: Ref 'refs/remotes/origin/feature/README' is unchanged
WARNING: Ref 'refs/remotes/origin/master' is unchanged
WARNING: Ref 'refs/remotes/origin/mjolnir' is unchanged
WARNING: Ref 'refs/remotes/origin/tmp' is unchanged
WARNING: Ref 'refs/tags/0.0.4' is unchanged
WARNING: Ref 'refs/tags/20190820141131-866368a' is unchanged
WARNING: Ref 'refs/tags/20190820142202-bd96767' is unchanged
WARNING: Ref 'refs/tags/20190820143451-fc7f46a' is unchanged
WARNING: Ref 'refs/tags/20190820143903-832818a' is unchanged
WARNING: Ref 'refs/tags/20190820150546-05e3105' is unchanged
WARNING: Ref 'refs/tags/20190820154631-da0cdab' is unchanged
WARNING: Ref 'refs/tags/20190820160956-047caa6' is unchanged
WARNING: Ref 'refs/tags/20190820162243-a300fa5' is unchanged
WARNING: Ref 'refs/tags/20190820170410-47f8878' is unchanged
WARNING: Ref 'refs/tags/untagged-f148f02c4d71ed0bea99' is unchanged
WARNING: Ref 'refs/tags/v.0.0.1' is unchanged
WARNING: Ref 'refs/tags/v0.0.1' is unchanged
WARNING: Ref 'refs/tags/v0.0.1-alpha' is unchanged
WARNING: Ref 'refs/tags/v0.0.10' is unchanged
WARNING: Ref 'refs/tags/v0.0.11-beta' is unchanged
WARNING: Ref 'refs/tags/v0.0.14' is unchanged
WARNING: Ref 'refs/tags/v0.0.3-alpha' is unchanged
WARNING: Ref 'refs/tags/v0.0.4-chaos-poc' is unchanged

As a result, the number of leaks does not seem to be going down.

I am confused as to why this is happening and would appreciate any pointers.

The refs that git filter-branch reports as unchanged did not have a file named terra/fixtures.go anywhere in their histories. Filter-branch is telling you that, although you asked it to update these branch names to point to any copied commits, no commits were actually copied in the process.
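A quick way to check that claim is to ask Git which commits, on any ref, ever touched the path. The following is only a sketch: the function name commits_touching is made up for illustration, and it must be run from inside the repository. If it prints nothing, filter-branch has nothing to rewrite for that path.

```shell
# Hypothetical helper (name invented for this sketch): print the hash of every
# commit, reachable from any ref, that added, changed, or deleted the given path.
commits_touching() {
    git log --all --format=%H -- "$1"
}

# Example usage, run inside the repository:
# commits_touching terra/fixtures.go
```

If the list is empty for the path you passed to git rm, the path is probably misspelled, or the file lives under a different name in the commits that gitleaks flagged.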

It might be interesting to find a list of reachable commit hash IDs that do have such a file, and then run git branch --contains on those hash IDs. See below.

Which commits contain file F?

Note that this is a different answer to a different question. It's also not looking for commits in which some path name was modified, but rather for commits in which some path name exists at all.

We start by using git rev-list to list all commits:

git rev-list --all |

The output from git rev-list is simply a list of every commit hash ID that is reachable from the named revision(s). In this case, --all names all branches and tags, along with other refs such as refs/stash, but not any reflog entries.

Then, for each commit listed, we want to test whether this commit contains the named file(s). At this point you generally want a lot of programmability. For instance, suppose the file name is a/b/c.txt. Do you want to also find A/B/C.TXT? If you're on Windows or macOS, you might. If you're on Linux, probably not. Or, maybe you want to find any file whose name starts or ends with some pattern.

What we'll do here is use git ls-tree -r, which lists out all the file names, and then run them through a search-and-status command such as grep. Note that grep searches for regular expressions, not glob patterns, so a*b means zero or more "a" characters followed by a "b" character and will match the strings "abc.txt", "b", "flobby", and so on: these all have zero or more "a"s followed by a "b". We'll let the actual matched names show through, so that a human can apply further filtering if needed:

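git rev-list --all |
    while read -r hash; do
        # list every path in this commit's tree, one name per line
        git ls-tree -r --name-only "$hash" > /tmp/files
        # -q tests for a match quietly (-s would only silence error messages)
        if grep -q 'terra/fixtures\.go' /tmp/files; then
            echo "commit ${hash} :"
            grep 'terra/fixtures\.go' /tmp/files
        fi
    done
rm -f /tmp/files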

The output of this set of commands (which you should probably put in a file, and which I have not tested and might contain errors) is a list of commit hash IDs suitable for extraction, each followed by the matched names. You should probably discard matches for, e.g., sputerra/fixtures.gobble.

(It's possible to write fancier grep patterns that match more exactly. In this case, anchoring the regular expression with ^ and $ would suffice, provided each line being searched is a bare path name, as git ls-tree -r --name-only produces. In more complicated cases, more complicated regular expressions are required. I leave this to whoever is using the code.)
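To illustrate the difference anchoring makes, here is a small sketch that runs both patterns over a made-up list of path names. The sample names are invented for the demonstration, and it assumes the input holds bare path names, one per line.

```shell
# Made-up sample paths; only the first is the file we actually want.
paths='terra/fixtures.go
sputerra/fixtures.gobble
terra/fixtures_go'

# Unanchored: matches any line that merely contains the text.
loose=$(printf '%s\n' "$paths" | grep 'terra/fixtures\.go')

# Anchored with ^ and $: the whole line must be exactly the path.
exact=$(printf '%s\n' "$paths" | grep '^terra/fixtures\.go$')

printf 'unanchored matches:\n%s\n' "$loose"
printf 'anchored matches:\n%s\n' "$exact"
```

The unanchored pattern also picks up sputerra/fixtures.gobble, while the anchored one keeps only terra/fixtures.go. The escaped dot matters too: it stops terra/fixtures_go from matching either pattern.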

Having obtained hash IDs (run the above and redirect to a file, clean up the file, and then extract the more interesting hash IDs), you can then do:

git branch --contains <hash>

on any given commit hash to see which branches contain that particular commit. Note that there may be zero or more branches containing any given commit. For (much) more about that, read and understand Think Like (a) Git.
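If the cleaned-up list holds more than a handful of hashes, the lookup can be looped. This is a sketch under assumptions: the function name branches_containing and the file name hashes.txt are hypothetical, and the file is expected to hold one commit hash per line.

```shell
# Hypothetical helper: read commit hashes from a file (one per line) and print,
# for each, the local and remote-tracking branches that contain that commit.
branches_containing() {
    while read -r hash; do
        [ -n "$hash" ] || continue          # skip blank lines
        echo "commit $hash:"
        git branch --all --contains "$hash"
    done < "$1"
}

# Example usage, run inside the repository:
# branches_containing hashes.txt
```

The --all flag makes git branch consider remote-tracking branches as well as local ones. Tags are not covered; git tag --contains does the same lookup for them.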

Try with double quotes:

git filter-branch --force --index-filter \
  "git rm -r --cached --ignore-unmatch 'terra/fixtures.go'" \
  --prune-empty --tag-name-filter cat -- --all

Try the new git filter-repo instead, which is meant to replace the old git filter-branch or BFG:

git filter-repo --path terra/fixtures.go --invert-paths

By default, this new command works on all branches. Then run git push --all --force to overwrite the history of the remote repository. Note that --all pushes branches only, so rewritten tags need a separate git push --tags --force.
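Before force-pushing, it may be worth confirming that the path is really gone from everything the refs can reach. The sketch below is one way to do that; the function name path_reachable is made up, and it relies on git rev-list --all --objects printing each reachable object's hash followed by its path.

```shell
# Hypothetical helper: succeed (exit 0) if any object reachable from any ref
# is still recorded under exactly the given path, fail (exit 1) otherwise.
path_reachable() {
    git rev-list --all --objects |
        awk -v p="$1" '
            { sub(/^[0-9a-f]+ /, "") }      # drop the leading object hash
            $0 == p { found = 1 }           # exact path comparison, no regex
            END { exit !found }
        '
}

# Example usage, run inside the rewritten repository:
# if path_reachable terra/fixtures.go; then echo "still present"; else echo "gone"; fi
```

Like git rev-list --all itself, this looks only at refs, not at reflog entries. After git filter-branch it also sees the backup refs under refs/original, so the path can still show up there until those backups are deleted.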


 