
Remove all files from Git repo history with path having escape \ in filename with git filter-repo

I have files whose names contain escape (\) characters stored in a Git repository on Debian 10 Linux.

Problem: these files cannot be checked out with git checkout on Windows, because their names contain characters that are not valid there.

Example:

git log --all --name-only -m --pretty= '*\\*'
"systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"

I get the following Git errors when checking out on Windows:

C:\Git\bin\git.exe reset --hard "5ef1cac3a03304c35b455edf32bd1bb78060c5b9" --
error: invalid path 'systemd/system/default.target.wants/snap-git\x2dfilter\x2drepo-7.mount'
fatal: Could not reset index file to revision '5ef1cac3a03304c35b455edf32bd1bb78060c5b9'.
Done

Steps to reproduce the problem:

# Clone repository, to be executed on a safe repo:
git clone --no-local /source/repo/path/ /target/path/to/repo/clone/
# Cloning into '/target/path/to/repo/clone'...
# remote: Enumerating objects: 9534, done.
# remote: Counting objects: 100% (9534/9534), done.
# remote: Compressing objects: 100% (4776/4776), done.
# remote: Total 9534 (delta 4215), reused 8043 (delta 3136), pack-reused 0
# Receiving objects: 100% (9534/9534), 7.41 MiB | 16.78 MiB/s, done.
# Resolving deltas: 100% (4215/4215), done.

cd /target/path/to/repo/clone/

# List the files with escape \ from repo history into a list file:
git log --all --name-only -m --pretty= '*\\*' | sort -u >/opt/git_repo_files_w_escape.txt

# Remove the files with escape \ from repo history:
git filter-repo --invert-paths --paths-from-file /opt/git_repo_files_w_escape.txt
Parsed 592 commits
New history written in 0.25 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 71128f3 .gitignore: ADD snap-git to be ignored
Enumerating objects: 9354, done.
Counting objects: 100% (9354/9354), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3694/3694), done.
Writing objects: 100% (9354/9354), done.
Total 9354 (delta 4085), reused 9354 (delta 4085), pack-reused 0
Completely finished after 0.55 seconds.


# List files with escape \ to check result:
git log --format="reference" --name-status --diff-filter=A '*\\*'
# "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"

# Unfortunately, filter-repo ran, but the log still lists filenames with escape \ :-(

Question:

1) How can I remove from the Git repo history all files whose path contains at least one escape \ character in the filename?

(Reason: files with incompatible characters in their names cannot be checked out on Windows.)

UPDATE1:

As suggested, I tried replacing the string \\x2d with - in the input file list, but the files were still not removed from the Git history:

# List the files with escape \ from repo history into a list file:
git log --all --name-only -m --pretty= '*\\*' | sort -u >/opt/git_repo_files_w_escape.txt

# Replace the string \\x2d with - in git_repo_files_w_escape.txt:
sed -i 's/\\\\x2d/-/g' /opt/git_repo_files_w_escape.txt

# Remove the listed files from repo history:
git filter-repo --invert-paths --paths-from-file /opt/git_repo_files_w_escape.txt
Parsed 592 commits
New history written in 0.25 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 71128f3 .gitignore: ADD snap-git to be ignored
Enumerating objects: 9354, done.
Counting objects: 100% (9354/9354), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3694/3694), done.
Writing objects: 100% (9354/9354), done.
Total 9354 (delta 4085), reused 9354 (delta 4085), pack-reused 0
Completely finished after 0.55 seconds.


# List files with escape \ to check result:
git log --format="reference" --name-status --diff-filter=A '*\\*'
# "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"

# Unfortunately, the log still lists filenames with \\x2d :-(

UPDATE2:

I tried replacing \\x2d in git_repo_files_w_escape.txt with \\\\x2d or \x2d, but neither variant removed the files with \\x2d in their names from the Git history.

UPDATE3:

I'm looking for a working solution based on git filter-repo.

Any more ideas?

FWIW, this worked on a Linux system; it let me rewrite the HEAD commit without having the files checked out on disk:

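# NUL-delimited output (-z) gives the literal names, without Git's quoting (bash required)
git ls-files -z | while IFS= read -r -d '' f; do
    case "$f" in *\\*) ;; *) continue ;; esac   # skip paths without a backslash
    new=${f//\\x2d/-}                            # replace the literal text \x2d with -
    mkdir -p "$(dirname "$new")"                 # the directory may not exist on disk yet
    git show "@:$f" > "$new"                     # write the blob from HEAD under the new name
    git rm --cached "$f"
    git add "$new"
done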

git status
git commit --amend

The same commands should work in Git Bash on Windows.

Assuming the files you want to fix are scattered throughout the hierarchy, a solution with git filter-repo looks tedious. You can instead use a combination of git fast-export and git fast-import to modify file names in the whole history.

git fast-export --no-data --all > exported

Now delete the file entries containing a backslash:

grep -v '^[DM] .*\\' exported > fixed

Instead of removing the files, you can also modify the file names. For example, to replace the backslash with a dash (-), you could try this:

# fast-export writes each backslash in a quoted path as \\, so match the doubled form
sed -e '/^[DM] /s,\\\\,-,g' < exported > fixed
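To see what these two filters do before running them on a real export, they can be tried on a few hand-written lines. This is only a sketch: the function names are hypothetical, and the sample lines in the usage comment imitate the style of git fast-export --no-data output with made-up object IDs and paths.

```shell
# Hypothetical helpers wrapping the two filters; each takes an export file as $1.

# Variant 1: drop every file entry (M or D line) whose path contains a backslash.
drop_backslash_entries() {
    grep -v '^[DM] .*\\' "$1"
}

# Variant 2: keep the entries but rename them. Matching the doubled backslash
# used inside quoted paths yields a single dash per backslash.
rename_backslash_entries() {
    sed -e '/^[DM] /s,\\\\,-,g' "$1"
}

# Usage on a made-up sample (assumption: real exports contain many more lines):
#   printf '%s\n' 'M 100644 1111111111111111111111111111111111111111 README.md' \
#       'D "systemd/system/old\\x2dunit.mount"' > sample
#   drop_backslash_entries sample     # keeps only the README.md line
#   rename_backslash_entries sample   # second line becomes D "systemd/system/old-x2dunit.mount"
```

Commit messages are left alone by both variants only as long as no message line happens to start with "M " or "D ", which is why comparing the two files afterwards is still worthwhile.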

You can now inspect the difference between the two files to make sure that no commit messages were modified:

diff -u exported fixed | less

Now attempt to import the modified history:

git fast-import < fixed

This will stop with an error telling you that the branches will not be modified, because the new branch heads do not contain the old ones. If there are no other errors, you can now force the modification:

git fast-import --force < fixed

You fed bad input into filter-repo, based on a common but incorrect assumption about how git log works.

Look at your own output:

$ git log --format="reference" --name-status --diff-filter=A '*\\*'
"systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"

Let's look at the first line as an example. If you store that line in a file and pass the file to --paths-from-file, then git-filter-repo is going to look for a file named "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount" to remove. You have no such file in your repository. Instead you have one named systemd/system/default.target.wants/snap-git\x2dfilter\x2drepo-7.mount . (Note that I have removed both " characters and two of the \ characters.)
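The mapping from the printed form back to the stored name can be sketched in a few lines of shell. The helper name unquote_git_path is hypothetical (it is not part of Git or filter-repo), and the sketch assumes backslash is the only escaped character in the path; octal escapes for non-ASCII bytes and escapes such as \t or \" are not handled.

```shell
# Hypothetical helper: turn one path as printed by git log / git ls-files
# back into the literal name stored in the repository.
# Assumption: the only escape sequence present is \\ (an escaped backslash).
unquote_git_path() {
    printf '%s\n' "$1" |
        sed -e 's/^"//' -e 's/"$//' -e 's/\\\\/\\/g'   # strip quotes, then \\ -> \
}

quoted='"systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"'
unquote_git_path "$quoted"
# prints: systemd/system/snap-git\x2dfilter\x2drepo-7.mount
```

A path that Git printed without quotes passes through unchanged, so the helper could be applied to every line of an already generated list.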

The problem here is that you assumed git log would list filenames as-is, which it doesn't do whenever they contain special characters. You can often get around this by setting core.quotepath=false (this particularly helps when you have non-ASCII characters), but names containing backslashes are still quoted even with that setting.
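The quoting behaviour is easy to reproduce in a throwaway repository on Linux. The following is a minimal sketch: the function name demo_quoting and the file name are made up for illustration, and it uses git ls-files rather than git log so that no commit is needed.

```shell
# Hypothetical demo: create a temp repo with one file whose name contains
# a backslash, then list it with and without -z.
demo_quoting() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        printf 'x\n' > 'snap-git\x2ddemo.mount'      # literal backslash in the name
        git add .
        # Even with core.quotepath=false the name is quoted and the backslash doubled.
        printf 'quoted:  %s\n' "$(git -c core.quotepath=false ls-files)"
        # With -z the name is printed as stored, terminated by a NUL byte.
        printf 'literal: %s\n' "$(git ls-files -z | tr '\0' '\n')"
    )
    rm -rf "$repo"
}

demo_quoting
# quoted:  "snap-git\\x2ddemo.mount"
# literal: snap-git\x2ddemo.mount
```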

Here's something that might work better for generating the list of filenames to exclude:

git log -z --all --name-only -m --pretty= '*\\*' | tr '\0' '\n' | sort -u >/opt/git_repo_files_w_escape.txt

but it assumes you do not have filenames containing newline characters. (If you do have files with newline characters, though, then --paths-from-file won't work for you.)
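The same NUL-delimited listing can double as a check after the rewrite. A minimal sketch, assuming it is run inside the clone being rewritten; the function name list_backslash_paths and the /tmp path in the usage comment are placeholders, not anything from filter-repo itself.

```shell
# Hypothetical helper: print each path containing a backslash that appears
# anywhere in history, one per line and unquoted.
# Assumption: no file name contains a newline (see the caveat above).
list_backslash_paths() {
    git log -z --all --name-only -m --pretty= -- '*\\*' |
        tr '\0' '\n' |
        sed '/^$/d' |      # drop empty lines, if any
        sort -u
}

# Possible usage:
#   list_backslash_paths > /tmp/paths_with_backslash.txt
#   git filter-repo --invert-paths --paths-from-file /tmp/paths_with_backslash.txt
#   list_backslash_paths | wc -l     # should be 0 if the removal worked
```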
