
Decoding Binary via fget / buffer string (Trying to get mp3 header)

I'm writing some quick code to try and extract data from an mp3 file header.

The objective is to extract information from the header, such as the bitrate and other vital information, so that I can stream the file to an mp3 decoder with the necessary arguments.

Here is a Wikipedia image showing the mp3 header information: http://upload.wikimedia.org/wikipedia/commons/0/01/Mp3filestructure.svg

My question is, am I approaching this correctly? Printing the data received is worthless: I just get a bunch of random characters. I need to get at the individual bits so that I can decode them and determine the vital information.

Here is my baseline code:

// mp3 Header File IO.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include "stdio.h"
#include "string.h"
#include "stdlib.h"

// Main function
int main (void)
{
    // Declare variables
    FILE *mp3file;
    char *mp3syncword; // we will need to allocate memory to this!!
    char requestedFile[255] = "";
    unsigned long fileLength;

    // Counters
    int i;

    // Memory allocation with malloc
    mp3syncword=(char *)malloc(2000);

    // Let's get the name of the requested file (hard-coded for now)
    strcpy(requestedFile,"testmp3.mp3");

    // Open the file with mode read, binary
    mp3file = fopen(requestedFile, "rb"); 
    if (!mp3file){
         // If we can't find the file, notify the user of the problem
         printf("Not found!");
    }

    // Let's get some header data from the file
    fseek(mp3file,1,SEEK_SET);
    fread(mp3syncword,32,1,mp3file);

    // For debug purposes, lets print the received data
     for(i = 0; i < 32; ++i)
        printf("%c", ((char *)mp3syncword)[i]);
    return 0;
}

Help appreciated.

You are printing the bytes out using %c as the format specifier. You need to use an unsigned numeric format specifier (e.g. %u for a decimal number, or %x or %X for hexadecimal) to print the byte values.

You should also declare your byte arrays as unsigned char. The signedness of plain char is implementation-defined in C, and it is signed by default with Visual C++ on Windows.

You might also want to print out a space (or other separator) after each byte value to make the output clearer.
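As a minimal sketch of the advice above, the helper below formats a buffer as space-separated two-digit hex values. The function name `format_hex_bytes` and the sample bytes in the usage note are assumptions made for illustration; they do not come from the original code.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of the original code): writes `len` bytes
   from `data` into `out` as two-digit hex values separated by spaces.
   Returns the number of characters written (excluding the terminator),
   or -1 if `out` is too small. */
int format_hex_bytes(const unsigned char *data, size_t len,
                     char *out, size_t out_size)
{
    size_t pos = 0;
    size_t i;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < len; ++i) {
        /* Two hex digits per byte, plus one separator before every
           byte except the first. */
        size_t need = (i == 0) ? 2 : 3;
        if (pos + need + 1 > out_size)
            return -1;
        pos += (size_t)sprintf(out + pos, (i == 0) ? "%02x" : " %02x",
                               (unsigned)data[i]);
    }
    return (int)pos;
}
```

Calling it on the four illustrative bytes 0xFF, 0xFB, 0x90, 0x64 fills the output buffer with `ff fb 90 64`, which can then be passed to printf with %s.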

The standard printf (prior to C23) does not provide a binary conversion specifier. Some implementations do have one, but the version supplied with Visual Studio does not. To output binary, you will need to perform bit operations on each byte to extract the individual bits and print them in turn. For example:

unsigned char byte = (unsigned char) mp3syncword[0]; // A byte read from the file
unsigned char mask = 1; // Bit mask
unsigned char bits[8];

// Extract the bits
for (int i = 0; i < 8; i++) {
    // Mask each bit in the byte and store it
    bits[i] = (byte & (mask << i)) >> i;
}

// The bits array now contains eight 1 or 0 values
// bits[0] contains the least significant bit
// bits[7] contains the most significant bit
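Since bits[0] holds the least significant bit, printing the array in index order would show the byte backwards. A small sketch that produces the bits in reading order (most significant first) is shown below; `byte_to_binary` is a hypothetical helper name, not something from the original answer.

```c
/* Hypothetical helper: fills `out` (room for 8 digits plus the
   terminator) with the bits of `byte` as '0'/'1' characters,
   most significant bit first. */
void byte_to_binary(unsigned char byte, char out[9])
{
    int i;

    for (i = 0; i < 8; ++i) {
        /* Bit 7 goes to out[0], bit 0 goes to out[7]. */
        out[i] = ((byte >> (7 - i)) & 1u) ? '1' : '0';
    }
    out[8] = '\0';
}
```

For the byte 0xFB this yields the string `11111011`, which can be printed with %s.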

Before C23, C had no printf() specifier to print in binary. Most people print in hex instead, which shows one byte (typically eight bits) as two hex digits:

printf("the first eight bits are %02x\n", (unsigned char) mp3syncword[0]);

You will need to interpret this manually to figure out the values of individual bits. The cast to unsigned char on the argument avoids surprises when the char is negative: without it, the value is sign-extended and prints with leading ff digits.

To test bits, you can use the & operator together with the bitwise left shift operator, <<:

if(mp3syncword[2] & (1 << 2))
{
  /* The third bit from the right of the third byte was set. */
}

If you want to be able to use "big" (larger than 7) indexes for bits, i.e. treat the data as a 32-bit word, it might be good to read it into e.g. an unsigned int, and then inspect that. Be careful with endianness when you do this, however.
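One way to sidestep the endianness concern is to assemble the word from the individual bytes with shifts, rather than reading directly into an integer. The sketch below assumes the four header bytes are stored most significant byte first; `read_be32` and `bit_is_set` are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: combines four bytes into a 32-bit word, treating
   p[0] as the most significant byte. The result is the same on
   little-endian and big-endian hosts. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: returns 1 if bit `index` (0 = least significant,
   31 = most significant) is set in `word`, otherwise 0. */
int bit_is_set(uint32_t word, unsigned index)
{
    return (int)((word >> index) & 1u);
}
```

With the illustrative bytes 0xFF, 0xFB, 0x90, 0x64, `read_be32` returns 0xFFFB9064, and `bit_is_set(word, 31)` reports the first bit of the file's byte stream.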

Warning: there are probably errors with memory layout and/or endianness with this approach. It is not guaranteed that the struct members map to the same bits from one computer (or compiler) to another.
In short: don't rely on this (I'll leave the answer, it might be useful for something else).

You can define a struct with bit fields:

struct MP3Header {
    unsigned SyncWord : 12;
    unsigned Version : 1;
    unsigned Layer : 2;
    unsigned ErrorProtection : 1;
    unsigned BitRate : 4;
    unsigned Frequency : 2;
    unsigned PadBit : 1;
    unsigned PrivBit : 1;
    unsigned Mode : 2;
    unsigned ModeExtension : 2;
    unsigned Copy : 1;
    unsigned Original : 1;
    unsigned Emphasis : 2;
};

and then use each member as an isolated value:

struct MP3Header h;
/* ... */
fread(&h, sizeof h, 1, mp3file); /* error check!! */
printf("Frequency: %u\n", h.Frequency);
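Given the warning about bit-field layout, a more portable sketch is to extract the same fields with shifts and masks. The version below assumes the field widths from the struct above, with SyncWord occupying the first (most significant) bits of the first byte; `ParsedHeader`, `field`, and `parse_header` are hypothetical names. The results are raw field values only, so mapping something like the bitrate index to an actual bitrate still needs the tables from the MPEG audio specification.

```c
#include <stdint.h>

/* Hypothetical plain struct holding the raw field values. */
struct ParsedHeader {
    unsigned sync_word, version, layer, error_protection;
    unsigned bitrate_index, frequency_index, pad_bit, priv_bit;
    unsigned mode, mode_extension, copy, original, emphasis;
};

/* Extracts `width` bits from `word`, starting `shift` bits above the
   least significant bit. */
static unsigned field(uint32_t word, unsigned shift, unsigned width)
{
    return (unsigned)((word >> shift) & (((uint32_t)1 << width) - 1u));
}

/* Hypothetical parser: b[0] is treated as the most significant byte, so
   the result does not depend on host endianness or bit-field layout. */
struct ParsedHeader parse_header(const unsigned char b[4])
{
    uint32_t w = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                 ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    struct ParsedHeader h;

    h.sync_word        = field(w, 20, 12);
    h.version          = field(w, 19, 1);
    h.layer            = field(w, 17, 2);
    h.error_protection = field(w, 16, 1);
    h.bitrate_index    = field(w, 12, 4);
    h.frequency_index  = field(w, 10, 2);
    h.pad_bit          = field(w, 9, 1);
    h.priv_bit         = field(w, 8, 1);
    h.mode             = field(w, 6, 2);
    h.mode_extension   = field(w, 4, 2);
    h.copy             = field(w, 3, 1);
    h.original         = field(w, 2, 1);
    h.emphasis         = field(w, 0, 2);
    return h;
}
```

For the illustrative bytes 0xFF, 0xFB, 0x90, 0x64, this gives a sync word of 0xFFF and a bitrate index of 9.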
