How to find binary files in a directory?

I need to find the binary files in a directory. I want to do this with file, and afterwards check the results with grep. My problem is that I don't know what counts as a binary file. What does the file command print for binary files, and what should I look for with grep?

This finds all non-text, binary, and empty files.

Edit

Solution with only grep (from Mehrdad's comment):

grep -rIL .

Original answer

This does not require any other tool except find and grep:

find . -type f -exec grep -IL . "{}" \;

-I tells grep to treat binary files as if they contained no match

-L prints only the names of files that have no match

. matches any character, so every text file with at least one character counts as a match and is left out
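
To see how the three pieces interact, here is a small sketch that builds a throwaway directory and runs the grep-only form against it. The helper name list_non_text and the file names are made up for illustration, and the sketch assumes GNU grep.

```shell
# list_non_text DIR: print files under DIR that grep treats as binary,
# plus files with nothing for "." to match (GNU grep assumed).
list_non_text() {
    grep -rIL . "$1" | sort
}

demo=$(mktemp -d)
printf 'hello\n'      > "$demo/text.txt"   # plain text: "." matches, so not listed
printf 'abc\000def\n' > "$demo/blob.bin"   # contains a NUL byte: treated as binary
: > "$demo/empty"                          # no characters at all: nothing matches

list_non_text "$demo"   # lists blob.bin and empty, but not text.txt
rm -rf "$demo"
```

The sort is only there to make the output order predictable.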


Edit 2

This finds all non-empty binary files:

find . -type f ! -size 0 -exec grep -IL . "{}" \;
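
A possible variant, sketched here under the assumption that GNU grep is in use: ending -exec with + instead of \; hands many file names to each grep call, which starts far fewer processes on large trees. The function name list_nonempty_binaries is hypothetical.

```shell
# list_nonempty_binaries DIR: non-empty files under DIR that grep treats
# as binary. "{} +" batches file names into as few grep calls as possible.
list_nonempty_binaries() {
    find "$1" -type f ! -size 0 -exec grep -IL . {} + | sort
}

demo=$(mktemp -d)
printf 'hello\n'      > "$demo/text.txt"
printf 'abc\000def\n' > "$demo/blob.bin"
: > "$demo/empty"                          # skipped by "! -size 0"

list_nonempty_binaries "$demo"   # lists only blob.bin
rm -rf "$demo"
```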

Just have to mention Perl's -T test for text files, and its opposite -B for binary files.

$ find . -type f | perl -lne 'print if -B'

will print out any binary files it sees. Use -T if you want the opposite: text files.

It's not totally foolproof, as it only looks at the first 1,000 characters or so, but it's better than some of the ad-hoc methods suggested here. See man perlfunc for the whole rundown. Here is a summary:

The "-T" and "-B" switches work as follows. The first block or so of the file is examined to see if it is valid UTF-8 that includes non-ASCII characters. If so, it's a "-T" file. Otherwise, that same portion of the file is examined for odd characters such as strange control codes or characters with the high bit set. If more than a third of the characters are strange, it's a "-B" file; otherwise it's a "-T" file. Also, any file containing a zero byte in the examined portion is considered a binary file.
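
The last rule in that summary, the zero-byte check, is easy to approximate in plain shell. This is only a rough sketch of that single rule, not a reimplementation of Perl's test: the 1024-byte window and the name has_nul_in_head are assumptions, and the one-third rule is not covered.

```shell
# has_nul_in_head FILE: succeed if the first 1024 bytes contain a NUL byte.
# The 1024-byte window is an assumption, not the size Perl actually reads.
has_nul_in_head() {
    total=$(head -c 1024 < "$1" | wc -c)
    kept=$(head -c 1024 < "$1" | tr -d '\000' | wc -c)
    # If deleting NUL bytes changed the byte count, at least one was present.
    [ "$((total))" -ne "$((kept))" ]
}

demo=$(mktemp -d)
printf 'hello\n'      > "$demo/text.txt"
printf 'abc\000def\n' > "$demo/blob.bin"

for f in "$demo"/*; do
    if has_nul_in_head "$f"; then
        echo "binary: $f"
    fi
done
rm -rf "$demo"
```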

My first answer to the question was pretty much in line with the ones here, using the find command. I think your instructor was looking to get you into the concept of magic numbers using the file command, which breaks files down into multiple types.

For my purposes, it was as simple as:

file * | grep executable

But it can be done in numerous ways.
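
One of those other ways, as a hedged sketch: instead of grepping the free-form description, ask file for the encoding with --mime-encoding, which reports "binary" for data it does not recognise as text. The function name is_binary_by_file is made up here, and the exact wording file prints can differ between versions, so treat the comparison string as an assumption to verify on your system.

```shell
# is_binary_by_file FILE: succeed if file(1) reports the encoding as "binary".
# -b drops the "name:" prefix so only the encoding is printed.
is_binary_by_file() {
    [ "$(file -b --mime-encoding "$1")" = "binary" ]
}

demo=$(mktemp -d)
printf 'hello\n'      > "$demo/text.txt"
printf 'abc\000def\n' > "$demo/blob.bin"

for f in "$demo"/*; do
    if is_binary_by_file "$f"; then
        echo "binary: $f"
    fi
done
rm -rf "$demo"
```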

As this is an assignment, you would probably hate me if I gave you the complete solution ;-) So here is a little hint:

By default, the grep command reports binary files that match, so searching for a regular expression such as . (a single dot), which matches in any file containing at least one character other than a newline, makes it list them:

grep . *

Output:

[...]
Binary file c matches
Binary file e matches

You can use awk to get the filenames only and ls to print the permissions. See the respective man pages (man grep, man awk, man ls).
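
For readers who are not doing this as an assignment, the awk step might look roughly like the sketch below. It assumes GNU grep: releases 3.5 and later word the notice as "grep: NAME: binary file matches" and send it to standard error, so the sketch merges stderr and accepts both wordings. The function name binary_matches is hypothetical, and a text line that happens to look like the notice would be misreported.

```shell
# binary_matches FILE...: print the names of files that grep reports as
# binary matches. LC_ALL=C keeps the notice in English.
binary_matches() {
    LC_ALL=C grep . "$@" 2>&1 | awk '
        /^Binary file .* matches$/ {            # older wording, on stdout
            sub(/^Binary file /, ""); sub(/ matches$/, ""); print; next
        }
        /^grep: .*: binary file matches$/ {     # newer wording, on stderr
            sub(/^grep: /, ""); sub(/: binary file matches$/, ""); print
        }'
}

demo=$(mktemp -d)
printf 'hello\n'      > "$demo/text.txt"
printf 'abc\000def\n' > "$demo/blob.bin"

binary_matches "$demo"/*   # prints only the path of blob.bin
rm -rf "$demo"
```

Passing the resulting names to ls -l then shows the permissions.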

In these modern times (2020 is practically the 3rd decade of the 21st century after all), I think the correct question is: how do I find all the non-UTF-8 files? UTF-8 is the modern equivalent of a text file.

UTF-8 encoding of text with non-ASCII code points will introduce non-ASCII bytes (i.e., bytes with the most significant bit set). Now, not all sequences of such bytes form valid UTF-8 sequences.

isutf8 from the moreutils package is what you need.

$ isutf8 -l /bin/*
/bin/[
/bin/acyclic
/bin/addr2line
/bin/animate
/bin/applydeltarpm
/bin/apropos
⋮

A quick check:

$ file $(isutf8 -l /bin/*)
/bin/[:             ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4d70c2142fc672d8a69d033ecb6693ec15b1e6fb, for GNU/Linux 3.2.0, stripped
/bin/acyclic:       ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=d428ea52eb0e8aaf7faf30914710d8fbabe6ca28, for GNU/Linux 3.2.0, stripped
/bin/addr2line:     ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=797f42bc4f8fb754a49b816b82d6b40804626567, for GNU/Linux 3.2.0, stripped
/bin/animate:       ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=36ab46e69c1bfea433382ffc9bbd9708365dac2b, for GNU/Linux 3.2.0, stripped
/bin/applydeltarpm: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=a1fddcbeec9266e698782596f2dfd1b4f3e0b974, for GNU/Linux 3.2.0, stripped
/bin/apropos:       symbolic link to whatis
⋮

You may wish to invert the test and get all the text files. Use -i:

$ isutf8 -il /bin/*
/bin/alias
/bin/bashbug
/bin/bashbug-64
/bin/bg
⋮
$ file -L $(isutf8 -il /bin/*)
/bin/alias:      a /usr/bin/sh script, ASCII text executable
/bin/bashbug:    a /usr/bin/sh - script, ASCII text executable, with very long lines
/bin/bashbug-64: a /usr/bin/sh - script, ASCII text executable, with very long lines
/bin/bg:         a /usr/bin/sh script, ASCII text executable
⋮

Yeah, it reads the whole file, but it's pretty speedy, and if you want accuracy…
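
If moreutils is not installed, a rough stand-in is to let iconv try to decode the file as UTF-8 and look at its exit status. This is a sketch of a substitute technique, not how isutf8 itself works, and the function name is_valid_utf8 is made up.

```shell
# is_valid_utf8 FILE: succeed if iconv can decode the whole file as UTF-8.
# iconv exits with a non-zero status on the first invalid byte sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1
}

demo=$(mktemp -d)
printf 'caf\303\251\n' > "$demo/utf8.txt"     # "café" encoded as UTF-8
printf '\377\376abc\n' > "$demo/not-utf8"     # 0xFF and 0xFE never occur in UTF-8

for f in "$demo"/*; do
    if ! is_valid_utf8 "$f"; then
        echo "not UTF-8: $f"
    fi
done
rm -rf "$demo"
```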

I think the best tool to determine the nature of a file is the file utility. In one of my directories I have only one file identified as binary by the nautilus file manager. For this file only, the command ls | xargs file returns "data" without any further information.

Binary executables on Linux are in the ELF format.

When you run the file command on such a binary, the output contains the word ELF. You can grep for this.

On the command line:

file <binary_file_name>

So, if you want to find the binary files inside a directory (on Linux, for example), you can do something like this:

ls | xargs file | grep ELF
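
What file keys on here is the ELF magic number: the first four bytes of an ELF file are 0x7f followed by the letters E, L, F. As an illustrative sketch (the function names is_elf and list_elf are made up), the same check can be done directly, which also sidesteps the problems ls | xargs has with spaces in file names:

```shell
# is_elf FILE: succeed if FILE starts with the ELF magic bytes 7f 45 4c 46.
is_elf() {
    magic=$(head -c 4 < "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# list_elf DIR: print the readable regular files directly inside DIR
# that start with the ELF magic number.
list_elf() {
    for f in "$1"/*; do
        if [ -f "$f" ] && [ -r "$f" ] && is_elf "$f"; then
            echo "$f"
        fi
    done
}

list_elf .
```

This only inspects the header, so it says nothing about whether the file is a complete or runnable executable.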

You can use find with the -executable test, which is basically what you want.

The man page says:

   -executable
          Matches files which are executable and directories which are searchable (in a file name resolution sense). This takes into account access control lists and other permissions artefacts which the -perm test ignores. This test makes use of the access(2) system call, and so can be fooled by NFS servers which do UID mapping (or root-squashing), since many systems implement access(2) in the client's kernel and so cannot make use of the UID mapping information held on the server. Because this test is based only on the result of the access(2) system call, there is no guarantee that a file for which this test succeeds can actually be executed.

Here is an example of the result:

# find /bin  -executable -type f | grep 'dmesg'
/bin/dmesg
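
Note that -executable looks only at permissions, so executable shell scripts are matched too. If only compiled binaries are wanted, one option is to combine it with a content check. The sketch below assumes GNU find and GNU grep, and the function name list_executable_binaries is made up.

```shell
# list_executable_binaries DIR: executable, non-empty regular files under
# DIR whose content grep treats as binary (GNU find and grep assumed).
list_executable_binaries() {
    find "$1" -type f -executable ! -size 0 -exec grep -IL . {} + | sort
}

demo=$(mktemp -d)
printf '#!/bin/sh\necho hi\n' > "$demo/script.sh"   # executable, but text
printf 'abc\000def\n'         > "$demo/tool.bin"    # executable and binary
printf 'abc\000def\n'         > "$demo/data.bin"    # binary, not executable
chmod +x "$demo/script.sh" "$demo/tool.bin"
chmod a-x "$demo/data.bin"

list_executable_binaries "$demo"   # lists only tool.bin
rm -rf "$demo"
```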
