
How to completely remove a file from git if you already pushed it to the remote branch, merged to the develop branch and it is not the latest commit

I've seen a lot of tutorials explaining how to do this in different scenarios, but most of them seem to cover only the latest commit. What I need is to remove a sensitive file completely from both branches: feature-branch and develop.

How do I do that?

I've found this recipe:

git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch path_to_file" HEAD

Will it work in my case? Namely, will it completely remove this file from all the branches?

EDIT:

I've decided to use BFG, and here's the output it gave me after I ran

bfg --delete-files 'filename_of_the_file_to_delete'

output:

Found 273 objects to protect
Found 198 commit-pointing refs : HEAD, refs/heads/develop, refs/heads/develop-with-relative-paths, ...

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit 7d97ab00 (protected by 'HEAD') - contains 1 dirty file :
    - src/email/filename_of_the_file_to_delete (16.1 KB)

WARNING: The dirty content above may be removed from other commits, but as
the *protected* commits still use it, it will STILL exist in your repository.

Details of protected dirty content have been recorded here :

/Users/albert/Documents/projects/rjx/rjxfp/.git.bfg-report/2020-08-25/21-50-38/protected-dirt/

If you *really* want this content gone, make a manual commit that removes it, and then run the BFG on a fresh copy of your repo.


Cleaning
--------

Found 486 commits
Cleaning commits:       100% (486/486)
Cleaning commits completed in 699 ms.

Updating 7 Refs
---------------

    Ref                                                    Before     After
    --------------------------------------------------------------------------
    refs/heads/develop                                   | e9c3c4ba | 53c5dd39
    refs/heads/feature-icons-for-top-level-cats          | d7dde80c | 377ae820
    refs/heads/feature-user-profile                      | 7d97ab00 | e3b1b336
    refs/remotes/origin/develop                          | e9c3c4ba | 53c5dd39
    refs/remotes/origin/feature-icons-for-top-level-cats | d7dde80c | 377ae820
    refs/remotes/origin/feature-user-profile             | 7d97ab00 | e3b1b336
    refs/stash                                           | 9fc9a356 | 39945789

Updating references:    100% (7/7)
...Ref update completed in 54 ms.

Commit Tree-Dirt History
------------------------

    Earliest                                              Latest
    |                                                          |
    ..........................................................DD

    D = dirty commits (file tree fixed)
    m = modified commits (commit message or parents changed)
    . = clean commits (no changes to file tree)

                            Before     After
    -------------------------------------------
    First modified commit | 6a211a6b | 35597e71
    Last dirty commit     | e9c3c4ba | 53c5dd39

Deleted files
-------------

    Filename                          Git id
    ------------------------------------------------------
    filename_of_the_file_to_delete    | 4121d724 (16.1 KB)


In total, 37 object ids were changed. Full details are logged here:

    /Users/albert/Documents/projects/rjx/rjxfp/.git.bfg-report/2020-08-25/21-50-38

BFG run is complete! When ready, run: git reflog expire --expire=now --all && git gc --prune=now --aggressive

The file I wanted to delete is still in the directory where it was created, at least on the branch from which I ran the bfg command. I also don't fully understand what it says about protected dirty content. Who protected it, and why? It says: If you *really* want this content gone, make a manual commit that removes it, and then run the BFG on a fresh copy of your repo.

I don't understand what I should do exactly.

As far as I understand, the commit it mentioned, 7d97ab00 (which is now e3b1b336), is the last commit on the branch from which I ran the command. So do I have to remove the file (with rm or with git rm?), then make a commit, and run BFG again?

While I haven't used The BFG, I understand from its documentation that it considers the tip-most commit to be a "correct state" (by default the commit that HEAD points to, which matches the "protected by 'HEAD'" line in your output). That is, suppose you want to remove the file secret.txt from every commit. If you use The BFG with the instruction "remove file secret.txt", it will remove it from all commits except the current commit (and the tip of any other ref it has been told to protect).

Remember that branch names simply identify some commit, by its commit hash ID. It's the commit itself that has the files in it. Every commit has a full snapshot of every file. So if you added secret.txt four commits ago, and have this:

... <-H <-I <-J <-K <-L   <--master

where each uppercase letter stands in for a commit, the file secret.txt is in commits L, K, J, and I. It's not in H because H comes before the commit that added it.
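If you want to check this on a real repository, one way is to walk the history and test each commit's snapshot for the path. The sketch below builds a throwaway repository shaped like the H..L picture. The file contents, commit messages and identity settings are placeholders of mine, not anything from your project.

```shell
#!/bin/sh
# Throwaway repository mirroring the H..L picture (all names are illustrative).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo base > README
git add README
git commit -q -m "H: before the secret"

echo hunter2 > secret.txt
git add secret.txt
git commit -q -m "I: add secret.txt"

for c in J K L; do
    echo "$c" >> README
    git commit -q -am "$c: unrelated work"
done

# Count the commits whose snapshot contains secret.txt.
with_secret=0
for rev in $(git rev-list HEAD); do
    if git cat-file -e "$rev:secret.txt" 2>/dev/null; then
        with_secret=$((with_secret + 1))
    fi
done
echo "commits containing secret.txt: $with_secret"

# Count the commits that changed secret.txt, across all refs.
touching=$(git log --all --oneline -- secret.txt | wc -l | tr -d ' ')
echo "commits touching secret.txt: $touching"
```

Only one commit touches the file, but every later snapshot still carries it, which is why removing it from the latest commit alone is not enough.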

The backwards-pointing arrows here are how Git works: every commit leads backwards to the previous commit. No part of any commit can ever be changed (not by The BFG either), so what The BFG must do is to create new and improved commits, then throw out the old commits entirely.
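The backwards arrows are quite literal: each commit object stores the hash ID of its parent, and a commit's own hash is computed from its full contents, parent line included. A small sketch in a scratch repository (the file name and messages are made up for illustration):

```shell
#!/bin/sh
# Show that a commit object records its parent's hash id.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo one > a.txt
git add a.txt
git commit -q -m "first"
echo two >> a.txt
git commit -q -am "second"

# Print the raw commit object: tree, parent, author, committer, message.
git cat-file -p HEAD

# Extract the recorded parent and compare it with what HEAD~1 resolves to.
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
actual_parent=$(git rev-parse HEAD~1)
echo "recorded: $recorded_parent"
echo "actual:   $actual_parent"
```

Because the parent hash is part of the commit, replacing an old commit with a fixed copy forces a new copy of every commit after it, even those whose files did not change.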

The BFG will copy existing commit I to a new-and-improved I' in which secret.txt does not exist. It will then copy J to a new-and-improved J' in which secret.txt does not exist, and repeat this for K. But L is the last commit, identified by name, so The BFG assumes that you mean to keep secret.txt there, because it's there now and identified by name. This is the "protected" commit. So The BFG copies L to L' (it has to do this because it copied K to K' and the existing L points back to the existing K), but this time it keeps secret.txt in commit L'.

You end up with:

... <-H [ XXX deleted: <-I <-J <-K <-L ]
       \
        I' <-J' <-K' <-L'   <-- master

in which secret.txt now exists only in that last commit, L', that was protected.

The documentation for The BFG suggests that you do this:

git rm secret.txt
git commit

before you start, so that you start with:

... <-H <-I <-J <-K <-L <-M   <--master

where new commit M doesn't have secret.txt in it. Now commits I through L can all be fixed up, because L isn't the last commit any more. It is not identified by name. The name master now finds M, not L; only M itself leads back to L.
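On the rm versus git rm question: either works, as long as the deletion ends up in a commit. A minimal sketch in a throwaway repository follows. The file contents, commit messages and identity settings are placeholders, not taken from your project.

```shell
#!/bin/sh
# Create the "M" commit that removes secret.txt from the branch tip.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

echo base > README
echo hunter2 > secret.txt
git add README secret.txt
git commit -q -m "L: tip commit that still has secret.txt"

# git rm deletes the working-tree file and stages the deletion in one step.
# Plain rm followed by "git add -u" would record the same change.
git rm -q secret.txt
git commit -q -m "M: remove secret.txt"

# The new tip no longer lists the file, but its parent still does.
in_tip=$(git ls-tree -r --name-only HEAD | grep -cxF secret.txt || true)
in_parent=$(git ls-tree -r --name-only HEAD~1 | grep -cxF secret.txt || true)
echo "in tip: $in_tip, in parent: $in_parent"
```

With M in place, the BFG's own message applies: run it on a fresh copy of the repository, so that the protected tip no longer contains the file.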

Notes

Once you've updated your own repository to have new-and-improved commits, and have thrown out the old bad ones, you will need to use git push --force to get any other Git that still has, and is still using, the old bad commits, to switch to the new and improved commits.
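To see why the force is needed, here is a sketch against a local bare repository standing in for the remote. The rewrite is simulated with git commit --amend rather than The BFG, and the directory names, branch name and file contents are assumptions for illustration only.

```shell
#!/bin/sh
# A rewritten branch is not a fast-forward, so a plain push is refused.
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.email "you@example.com"
git config user.name "Example"
git remote add origin "$work/remote.git"

echo base > README
echo hunter2 > secret.txt
git add README secret.txt
git commit -q -m "commit with secret.txt"
git push -q origin HEAD:refs/heads/develop

# Stand-in for the history rewrite: replace the tip with a copy lacking the file.
git rm -q secret.txt
git commit -q --amend -m "commit without secret.txt"

# The plain push fails because the remote's tip is not an ancestor of ours.
if git push -q origin HEAD:refs/heads/develop 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# The forced push moves the remote branch to the rewritten commit.
git push -q --force origin HEAD:refs/heads/develop
remote_tip=$(git --git-dir="$work/remote.git" rev-parse refs/heads/develop)
local_tip=$(git rev-parse HEAD)
echo "remote tip: $remote_tip"
echo "local tip:  $local_tip"
```

The same push has to be repeated for every branch that was rewritten, and anyone else with a clone still has the old commits until they reset to the new ones.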

Always assume that if secret.txt was available on the web for even a few seconds, someone out there grabbed a copy of it.
