Find filenames that contain a hex value?

I would like to correct a bad encoding for thousands of files. The error is always the same: an unknown character should be replaced with a French é.

$ find . -type f | grep 127427
./documents/1778_commande_127427_accus�_de_r�ception.pdf

$ find . -type f | grep 127427 | hexdump -C
00000000  2e 2f 64 6f 63 75 6d 65  6e 74 73 2f 31 37 37 38  |./documents/1778|
00000010  5f 63 6f 6d 6d 61 6e 64  65 5f 31 32 37 34 32 37  |_commande_127427|
00000020  5f 61 63 63 75 73 ef bf  bd 5f 64 65 5f 72 ef bf  |_accus..._de_r..|
00000030  bd 63 65 70 74 69 6f 6e  2e 70 64 66 0a           |.ception.pdf.|
0000003d

So I am looking for ef bf bd, which does not look like a Unicode character to me. Unfortunately, searching for 0xef does not work:

$ find . -type f | grep -P '\xef'
(nothing)

Any clues?

Next I am planning to do something like:

$ find . -type f | grep <magic-here> | xargs -n1 -I{} sh -c 'mv "{}" $(echo "{}" | sed s/<magic-here>/é/) '

Like this:

echo $'\x2e\x2f\x64\x6f\x63\x75\x6d\x65\x6e\x74\x73\x2f\x31\x37\x37\x38\x5f\x63\x6f\x6d\x6d\x61\x6e\x64\x65\x5f\x31\x32\x37\x34\x32\x37\x5f\x61\x63\x63\x75\x73\xef\xbf\xbd\x5f\x64\x65\x5f\x72\xef\xbf\xbd\x63\x65\x70\x74\x69\x6f\x6e\x2e\x70\x64\x66\x0a'\
| grep -Fa $'\xef\xbf\xbd'

-a treats binary input as text. -F performs a fixed-string search with no regular expressions. $'...' is bash's ANSI-C quoting, which expands escapes such as \xef into raw bytes.
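The bytes ef bf bd are the UTF-8 encoding of U+FFFD, the Unicode replacement character. One likely reason grep -P '\xef' finds nothing is that, in a UTF-8 locale, PCRE reads \xef as the code point U+00EF rather than as a raw byte. As a minimal sketch of the same fixed-string search applied to file names, the helper below (list_bad_names is a hypothetical name, not part of the original answer) builds the bytes with printf so that it also works in shells without $'...':

```shell
# U+FFFD (REPLACEMENT CHARACTER) is the three bytes EF BF BD in UTF-8.
# printf with octal escapes yields the same bytes as $'\xef\xbf\xbd'.
bad=$(printf '\357\277\275')

# list_bad_names DIR  (hypothetical helper name)
# Prints every regular file under DIR whose path contains those bytes.
# LC_ALL=C makes grep compare raw bytes regardless of the user's locale.
list_bad_names() {
    find "$1" -type f | LC_ALL=C grep -aF -- "$bad"
}
```

Calling list_bad_names . from the top of the tree should print the affected paths, one per line.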


The find command should look like this (note that sed rewrites the contents of the files it is given, not their names):

find ... -exec sed $'s/\xef\xbf\xbd/é/g' {} +

When you are sure that it works, add -i (GNU sed), which changes the files in place:

find ... -exec sed -i $'s/\xef\xbf\xbd/é/g' {} +
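Since sed acts on what is inside a file, renaming the files themselves needs mv. The following is a minimal sketch under stated assumptions, not the original answer's method: rename_bad_names is a hypothetical helper, only regular files are renamed, and file names are assumed to contain no newlines. It prints each planned rename and only performs it when the second argument is run.

```shell
bad=$(printf '\357\277\275')   # UTF-8 bytes of U+FFFD
good=$(printf '\303\251')      # UTF-8 bytes of é (U+00E9)

# rename_bad_names DIR [run]  (hypothetical helper name)
# Dry run by default: prints "old -> new" for each affected file.
# Pass "run" as the second argument to actually rename.
rename_bad_names() {
    LC_ALL=C find "$1" -type f -name "*${bad}*" | while IFS= read -r path; do
        dir=${path%/*}      # directory part of the path
        base=${path##*/}    # file name without the directory
        # Replace every occurrence of the bad bytes in the name only.
        new=$(printf '%s' "$base" | LC_ALL=C sed "s/${bad}/${good}/g")
        echo "$path -> $dir/$new"
        if [ "${2:-}" = run ]; then
            mv -- "$path" "$dir/$new"
        fi
    done
}
```

Running rename_bad_names . first and checking the printed list before running rename_bad_names . run keeps the step reversible for as long as possible.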
