
Search Binary File for a Pattern

I need to search for a binary pattern in a binary file. How can I do it?

I tried the strstr() function, converting both the file and the pattern to strings, but it's not working.

(The pattern is also a binary file.) This is what I tried:

void isinfected(FILE *file, FILE *sign, char filename[], char filepath[])
{
char* fil,* vir;
int filelen, signlen;
fseek(file, 0, SEEK_END);
fseek(sign, 0, SEEK_END);
filelen = ftell(file);
signlen = ftell(sign);

fil = (char *)malloc(sizeof(char) * filelen);
if (!fil)
{
    printf("unseccesful malloc!\n");
}

vir = (char *)malloc(sizeof(char) * signlen);

if (!vir)
{
    printf("unseccesful malloc!\n");
}

fseek(file, 0, SEEK_CUR);
fseek(sign, 0, SEEK_CUR);

fread(fil, 1, filelen, file);
fread(vir, 1, signlen, sign);
if (strstr(vir, fil) != NULL)
    log(filename, "infected",filepath );
else
    log(filename, "not infected", filepath);
free(vir);
free(fil);
}

For any binary handling you should never use one of the strXX functions, because they work only on C-style zero-terminated strings. Your code is failing because the strXX functions cannot look beyond the first zero byte they encounter.
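To see the effect on a small scale, here is a minimal sketch. The buffer contents and both function names are made up for illustration. The pattern "CD" sits after a zero byte, so strstr never reaches it, while a length-aware byte comparison does.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative data: six bytes with a zero byte in the middle. */
static const char sample_data[6] = { 'A', 'B', '\0', 'C', 'D', 'E' };

/* Returns 1 if strstr finds "CD" in sample_data, else 0.
   strstr treats the zero byte at index 2 as the end of the string. */
int found_with_strstr(void)
{
    return strstr(sample_data, "CD") != NULL;
}

/* Returns 1 if a scan that uses the real buffer length finds "CD", else 0. */
int found_with_memcmp(void)
{
    size_t i;

    for (i = 0; i + 2 <= sizeof sample_data; ++i)
        if (memcmp(sample_data + i, "CD", 2) == 0)
            return 1;
    return 0;
}
```

With this data, found_with_strstr() reports no match and found_with_memcmp() reports a match.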

As your basic idea with strstr appears correct (and only fails because it works on zero-terminated strings only), you can replace it with memmem, which does the same on arbitrary data. Since memmem is a GNU C extension (see also "Is there a particular reason for memmem being a GNU extension?"), it may not be available on your system, in which case you need to write code that does the same thing.
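Where memmem is available (glibc exposes it when _GNU_SOURCE is defined before the first #include), the whole check might look roughly like the sketch below. The helper names read_all and is_infected are made up for this example, and the logging from the question is left out. Compared with the question's code, it also rewinds both files before reading, because fseek(file, 0, SEEK_CUR) leaves the position at the end of the file and fread then reads nothing. It passes the file contents as the first argument, since the haystack comes first and the needle second.

```c
#define _GNU_SOURCE   /* assumption: glibc; must precede the first #include */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads the whole stream into a malloc'd buffer.
   Returns NULL on failure; on success *len holds the byte count. */
static char *read_all(FILE *fp, size_t *len)
{
    long size;
    char *buf;

    if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 0)
        return NULL;
    rewind(fp);                       /* back to the start before reading */

    buf = malloc(size > 0 ? (size_t)size : 1);
    if (buf == NULL)
        return NULL;
    if (fread(buf, 1, (size_t)size, fp) != (size_t)size) {
        free(buf);
        return NULL;
    }
    *len = (size_t)size;
    return buf;
}

/* Returns 1 if the bytes of sign occur in file, 0 if not,
   and -1 on an I/O or memory error. */
int is_infected(FILE *file, FILE *sign)
{
    size_t filelen = 0, signlen = 0;
    char *fil = read_all(file, &filelen);
    char *vir = read_all(sign, &signlen);
    int result = -1;

    if (fil != NULL && vir != NULL)
        result = memmem(fil, filelen, vir, signlen) != NULL;  /* haystack first */

    free(vir);   /* free(NULL) is a no-op, so this is safe after a failure */
    free(fil);
    return result;
}
```

Both files should be opened in binary mode ("rb"), otherwise some platforms translate line endings and the byte counts no longer match the file contents.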

For a very basic implementation of memmem you can use memchr to scan for the first byte of the pattern, followed by memcmp if it found something:

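#include <string.h>

void *my_memmem(const void *big, size_t big_len, const void *little, size_t little_len)
{
    /* byte pointers: arithmetic on void * is not standard C */
    const unsigned char *cur = big;
    const unsigned char *needle = little;
    const unsigned char *last;  /* last position where a match can still start */

    if (little_len == 0)
        return (void *)big;     /* an empty pattern matches at the start */
    if (big_len < little_len)
        return NULL;

    last = cur + (big_len - little_len);
    while (cur <= last)
    {
        /* jump to the next occurrence of the pattern's first byte */
        cur = memchr(cur, needle[0], (size_t)(last - cur) + 1);
        if (cur == NULL)
            return NULL;
        /* cur <= last, so memcmp cannot read past the end of big */
        if (memcmp(cur, needle, little_len) == 0)
            return (void *)cur;
        cur++;
    }
    return NULL;
}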

There are better implementations possible, but unless memmem is an important function in your program, it'll do the job just fine.

The basic idea is to check if vir matches the beginning of fil. If it doesn't, then you check again, starting at the second byte of fil, and repeat until you find a match or until you've reached the end of fil. (This is essentially what a simple implementation of strstr does, except that strstr treats 0 bytes as a special case.)

int i;
// <= so that a match ending at the very last byte of fil is also checked
for (i = 0; i <= filelen - signlen; ++i) {
  if (memcmp(vir, fil + i, signlen) == 0) {
    return true;   // vir found in fil at offset i
  }
}
return false;  // vir does not occur in fil

This is the "brute force" approach. It can get very slow if your files are long. More advanced search algorithms, such as Boyer-Moore or Knuth-Morris-Pratt, can potentially make this much faster, but this is a good starting point.
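As one example of a faster approach, here is a minimal sketch of the Boyer-Moore-Horspool algorithm working on raw bytes. It is not from the answer above. The function name horspool_search and its parameters are chosen for this illustration. The idea is to look at the byte aligned with the end of the pattern and use a precomputed table to skip several positions at once when that byte rules out a match.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Sketch of Boyer-Moore-Horspool on raw bytes.
   Returns a pointer to the first match of pat in buf, or NULL if there is none. */
const unsigned char *horspool_search(const unsigned char *buf, size_t buflen,
                                     const unsigned char *pat, size_t patlen)
{
    size_t shift[UCHAR_MAX + 1];
    size_t i, pos;

    if (patlen == 0)
        return buf;             /* an empty pattern matches at the start */
    if (buflen < patlen)
        return NULL;

    /* A byte that does not occur in the pattern allows a full-length skip. */
    for (i = 0; i <= UCHAR_MAX; ++i)
        shift[i] = patlen;
    /* Bytes that do occur (ignoring the last position) allow a shorter skip. */
    for (i = 0; i + 1 < patlen; ++i)
        shift[pat[i]] = patlen - 1 - i;

    pos = 0;
    while (pos <= buflen - patlen) {
        if (memcmp(buf + pos, pat, patlen) == 0)
            return buf + pos;
        /* Skip according to the byte aligned with the pattern's last position. */
        pos += shift[buf[pos + patlen - 1]];
    }
    return NULL;
}
```

The worst case is still proportional to the product of the two lengths, but on typical data the skips mean that only a fraction of the buffer positions are examined.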
