Is there a way to check whether a file's data is encrypted?

I want to check whether a file is encrypted. Is there a better way to detect this?

  • I have used Shannon entropy to measure a file's entropy. A high value suggests the data is encrypted, compressed, or random, since any of these three conditions produces high entropy.
  • However, it cannot distinguish between a compressed file and an encrypted file, because high entropy can result from either condition.
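For reference, the byte-level entropy measurement described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the function name `shannon_entropy` and the sample inputs are assumptions.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the byte-level Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical sample inputs: plain text versus random bytes.
text = b"the quick brown fox jumps over the lazy dog " * 200
print(shannon_entropy(text))
print(shannon_entropy(os.urandom(len(text))))
```

Plain text usually scores well below 8 bits per byte, while encrypted, well-compressed, and random data all come out close to 8, which is exactly why this single number cannot separate them.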

How can we detect whether a file is encrypted?

Theoretically, you can't tell the difference between random data, encrypted data, and data that has been maximally compressed.

In real life, though, compressed and encrypted data are encoded in file formats with headers and other low-entropy regions that you can use to recognize them.
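One way to use those headers is to compare the first few bytes of a file against known signatures. The sketch below is only an illustration: the table is a small, incomplete subset of well-known magic numbers, and the "compressed"/"encrypted" labels and the name `classify_by_signature` are assumptions made for this example.

```python
# (magic bytes, category, format name); a small illustrative subset only.
SIGNATURES = [
    (b"Salted__", "encrypted", "OpenSSL enc (salted)"),
    (b"-----BEGIN PGP MESSAGE-----", "encrypted", "PGP (ASCII-armored)"),
    (b"age-encryption.org/v1", "encrypted", "age"),
    (b"LUKS\xba\xbe", "encrypted", "LUKS volume"),
    (b"\x1f\x8b", "compressed", "gzip"),
    (b"BZh", "compressed", "bzip2"),
    (b"\xfd7zXZ\x00", "compressed", "xz"),
    (b"\x28\xb5\x2f\xfd", "compressed", "Zstandard"),
    (b"PK\x03\x04", "compressed", "zip"),
    (b"7z\xbc\xaf\x27\x1c", "compressed", "7z"),
    (b"\x89PNG\r\n\x1a\n", "compressed", "PNG"),
    (b"\xff\xd8\xff", "compressed", "JPEG"),
]


def classify_by_signature(head: bytes):
    """Return (category, format name) if head starts with a known signature, else None."""
    for magic, category, name in SIGNATURES:
        if head.startswith(magic):
            return category, name
    return None


print(classify_by_signature(b"\x1f\x8b\x08\x00rest-of-file"))
```

A signature only identifies the container. Zip and 7z archives, for example, can hold encrypted entries, so a match here is a hint rather than proof, and a file with no recognizable header stays ambiguous.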

A pretty good implementation would be to look for a whole bunch of tags (file signatures) that you know, and then use a rule of thumb:

  • A small bit of low-entropy stuff at the beginning => encrypted, while
  • A long bit of low-entropy stuff at the beginning (like JPEG), or low-entropy stuff at the end (like zip), or many little low-entropy bits in between (like audio/video) => compressed.
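To apply a rule of thumb like this, you first need to know where the low-entropy regions are. The sketch below measures entropy per fixed-size window and reports which windows fall below a threshold. The window size of 1024 bytes, the threshold of 7.0 bits, and the function names are all assumptions chosen for illustration. Small windows never reach 8 bits even for random data, so the threshold has to be tuned to the window size.

```python
import math
import os
from collections import Counter


def block_entropy(block: bytes) -> float:
    """Byte-level Shannon entropy of one block, in bits per byte."""
    total = len(block)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(block).values())


def low_entropy_windows(data: bytes, window: int = 1024, threshold: float = 7.0) -> list:
    """Return the indices of full windows whose entropy is below threshold.

    A trailing partial window is skipped, because a short block scores low
    purely due to its small sample size.
    """
    indices = []
    for start in range(0, len(data) - window + 1, window):
        if block_entropy(data[start:start + window]) < threshold:
            indices.append(start // window)
    return indices


# Hypothetical input: a low-entropy "header" followed by random-looking payload.
sample = b"HEADER".ljust(1024, b"\x00") + os.urandom(8 * 1024)
print(low_entropy_windows(sample))
```

Where the reported indices cluster (only at the start, at the end, or scattered throughout) is what the rule of thumb above goes on. Skipping the trailing partial window keeps the output free of false positives but can miss a short trailer, so a real tool would probably want to examine the tail separately.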

Also, the overall entropy of compressed data, if you measure it with bigrams or trigrams, will not be as high as that of encrypted data, because compression is never perfect.
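A bigram measurement of this kind might look like the sketch below, which counts overlapping byte pairs instead of single bytes. The name `bigram_entropy` is an assumption, and the result is in bits per bigram, so the ceiling is 16 rather than 8.

```python
import math
from collections import Counter


def bigram_entropy(data: bytes) -> float:
    """Return the Shannon entropy over overlapping byte pairs, in bits per bigram (0.0 to 16.0)."""
    if len(data) < 2:
        return 0.0
    pairs = Counter(zip(data, data[1:]))  # sliding window of two adjacent bytes
    total = len(data) - 1
    return -sum((c / total) * math.log2(c / total) for c in pairs.values())


print(bigram_entropy(b"abcd"))
```

There are 65,536 possible bigrams, so the sample needs to be much larger than that before the estimate means anything, and the gap between compressed and encrypted data can be small. Compare samples of equal length, because the estimate is biased downward on short inputs.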

