
Find Character String In Binary Data

I have a binary file I've loaded using an NSData object. Is there a way to locate a sequence of characters, 'abcd' for example, within that binary data and return its offset without converting the entire file to a string? It seems like there should be a simple answer, but I'm not sure how to do it. Any ideas?

I'm doing this on iOS 3, so I don't have -rangeOfData:options:range: available.

I'm going to award this one to Sixteen Otto for suggesting strstr. I found the source code for the C function strstr and rewrote it to work on a fixed-length Byte array, which incidentally differs from a C string in that it is not null-terminated. Here is the code I ended up with:

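- (Byte*)offsetOfBytes:(const Byte*)bytes inBuffer:(const Byte*)buffer ofLength:(int)len
{
    // bytes is the pattern to find and must be null-terminated;
    // buffer is the binary data, len bytes long, and may contain zero bytes
    const Byte *cp = buffer;
    const Byte *end = buffer + len;
    const Byte *s1, *s2;

    // an empty pattern matches at the start of the buffer
    if ( !*bytes )
        return (Byte*)buffer;

    for ( ; cp < end; ++cp)
    {
        s1 = cp;
        s2 = bytes;

        // advance while the bytes match, stopping at the end of the
        // pattern or the end of the buffer
        while ( s1 < end && *s2 && *s1 == *s2 )
            s1++, s2++;

        // reaching the pattern's terminator means every byte matched
        if (!*s2)
            return (Byte*)cp;
    }

    return NULL;
}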

This returns a pointer to the first occurrence of bytes (the pattern I'm looking for) within buffer (the byte array that should contain it), or NULL if there is no match.

I call it like this:

// data is the NSData object
const Byte *bytes = [data bytes];
Byte* index = [self offsetOfBytes:tag inBuffer:bytes ofLength:[data length]];
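The method above returns a pointer, while the question asks for an offset. One way to bridge the two is to subtract the start of the buffer from the returned pointer. The following is a minimal plain-C sketch of that step; `offset_from_pointer` is a hypothetical helper name that does not appear in the original post, and using -1 for "not found" is an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: converts a match pointer into a byte offset
   measured from the start of buffer. Returns -1 when match is NULL,
   i.e. when the search found nothing. */
static ptrdiff_t offset_from_pointer(const unsigned char *buffer,
                                     const unsigned char *match)
{
    if (match == NULL)
        return -1;
    return match - buffer;
}
```

With the call shown above, this would presumably be used as `offset_from_pointer(bytes, index)`.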

Convert your substring to an NSData object, and search for those bytes in the larger NSData using rangeOfData:options:range:. Make sure that the string encodings match!

On iPhone, where that isn't available, you may have to do this yourself. The C function strstr() will give you a pointer to the first occurrence of a pattern within the buffer (as long as neither contains nulls!), but not the index. Here's a function that should do the job (but no promises, since I haven't actually tried running it...):

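- (NSUInteger)indexOfData:(NSData*)needle inData:(NSData*)haystack
{
    // cast to byte pointers, since a const void* cannot be indexed
    const unsigned char* needleBytes = (const unsigned char*)[needle bytes];
    const unsigned char* haystackBytes = (const unsigned char*)[haystack bytes];
    NSUInteger needleLength = [needle length];
    NSUInteger haystackLength = [haystack length];

    // a needle longer than haystack can't match; checking first also keeps
    // the unsigned subtraction below from wrapping around
    if (needleLength > haystackLength)
    {
        return NSNotFound;
    }

    // walk the length of the buffer, looking for a byte that matches the start
    // of the pattern; we can skip (|needle|-1) bytes at the end, since we can't
    // have a match that's shorter than needle itself
    for (NSUInteger i=0; i <= haystackLength-needleLength; i++)
    {
        // walk needle's bytes while they still match the bytes of haystack
        // starting at i; if we walk off the end of needle, we found a match
        NSUInteger j=0;
        while (j < needleLength && needleBytes[j] == haystackBytes[i+j])
        {
            j++;
        }
        if (j == needleLength)
        {
            return i;
        }
    }
    return NSNotFound;
}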

This runs in something like O(nm), where n is the buffer length and m is the size of the substring. It's written to work with NSData for two reasons: 1) that's what you seem to have in hand, and 2) those objects already encapsulate both the actual bytes and the length of the buffer.
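The same brute-force scan can also be sketched in plain C with the standard `memchr` and `memcmp` functions. The worst case is still O(nm), but both buffers carry explicit lengths, so zero bytes inside the data are not treated as terminators the way strstr() treats them. The name `find_bytes` and the `(size_t)-1` "not found" value are assumptions made for this sketch, not part of the answer above.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first occurrence of needle in haystack,
   or (size_t)-1 if it does not occur. Lengths are passed explicitly,
   so embedded zero bytes are compared like any other byte. */
static size_t find_bytes(const unsigned char *haystack, size_t haystack_len,
                         const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return 0;                  /* empty pattern matches at the start */
    if (needle_len > haystack_len)
        return (size_t)-1;         /* pattern cannot fit in the buffer */

    const unsigned char *p = haystack;
    /* one past the last position where a full match can still start */
    const unsigned char *limit = haystack + (haystack_len - needle_len) + 1;

    while (p < limit) {
        /* jump to the next byte equal to the pattern's first byte */
        p = memchr(p, needle[0], (size_t)(limit - p));
        if (p == NULL)
            break;
        /* compare the whole pattern at that position */
        if (memcmp(p, needle, needle_len) == 0)
            return (size_t)(p - haystack);
        p++;
    }
    return (size_t)-1;
}
```

With an NSData object, the haystack arguments would presumably come from `[data bytes]` and `[data length]`.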

If you're using Snow Leopard, a convenient way is the new -rangeOfData:options:range: method in NSData, which returns the range of the first occurrence of a piece of data. Otherwise, you can access the NSData's contents yourself using its -bytes method and perform your own search.

I had the same problem. I solved it the other way round from what the other answers suggest.

First, I convert the data to a string (assuming your NSData is stored in the variable rawFile) with:

NSString *ascii = [[NSString alloc] initWithData:rawFile encoding:NSASCIIStringEncoding];

Now you can easily search for strings like 'abcd', or whatever you want, by passing the ascii string to an NSScanner. This may not be very efficient, but it works until -rangeOfData:options:range: becomes available on iPhone as well.
