C++ - Reading in 16bit .wav files

I'm trying to read in a .wav file. I thought my code was giving me the correct result, but when I plot the same audio file in Matlab or Python, the results are different.

This is the result that I get:

[Image: plot of the signal as read by the C++ code]

This is the result that Python (plotted with matplotlib) gives:

[Image: plot of the same file in Python (matplotlib)]

The two plots do not look very different, but the discrepancy is messing up my analysis.

Here is the code that converts the raw bytes to samples:

for (int i = 0; i < size; i += 2)
{
    int c = (data[i + 1] << 8) | data[i];
    double t = c/32768.0;
    //cout << t << endl;
    rawSignal.push_back(t);
}

Where am I going wrong? The conversion seems fine, and it does produce very similar results.

Thanks

EDIT:

Code to read the header / data:

// Returns false on any read or format error, true otherwise.
bool readHeader(ifstream& file) {

    s_riff_hdr riff_hdr;
    s_chunk_hdr chunk_hdr;

    long padded_size; // Size of extra bits

    vector<uint8_t> fmt_data; // Vector to store the FMT data.

    s_wavefmt *fmt = NULL;

    file.read(reinterpret_cast<char*>(&riff_hdr), sizeof(riff_hdr));
    if (!file) return false;

    if (memcmp(riff_hdr.id, "RIFF", 4) != 0) return false;

    //cout << "size=" << riff_hdr.size << endl;
    //cout << "type=" << string(riff_hdr.type, 4) << endl;

    if (memcmp(riff_hdr.type, "WAVE", 4) != 0) return false;
    {
         do
         {
            file.read(reinterpret_cast<char*>(&chunk_hdr), sizeof(chunk_hdr));
            if (!file) return false;
            padded_size = ((chunk_hdr.size + 1) & ~1);

            if (memcmp(chunk_hdr.id, "fmt ", 4) == 0) 
            {
                if (chunk_hdr.size < sizeof(s_wavefmt)) return false;

                fmt_data.resize(padded_size);
                file.read(reinterpret_cast<char*>(&fmt_data[0]), padded_size);
                if (!file) return false;

                fmt = reinterpret_cast<s_wavefmt*>(&fmt_data[0]);

                sample_rate2 = fmt->sample_rate;

                if (fmt->format_tag == 1) // PCM
                {
                    if (chunk_hdr.size < sizeof(s_pcmwavefmt)) return false;

                    s_pcmwavefmt *pcm_fmt = reinterpret_cast<s_pcmwavefmt*>(fmt);


                    bits_per_sample = pcm_fmt->bits_per_sample;
                }
                else
                {
                    if (chunk_hdr.size < sizeof(s_wavefmtex)) return false;

                    s_wavefmtex *fmt_ex = reinterpret_cast<s_wavefmtex*>(fmt);


                    if (fmt_ex->extra_size != 0)
                    {
                        if (chunk_hdr.size < (sizeof(s_wavefmtex) + fmt_ex->extra_size)) return false;

                        uint8_t *extra_data = reinterpret_cast<uint8_t*>(fmt_ex + 1);
                        // use extra_data, up to extra_size bytes, as needed...
                    }

                }
                //cout << "extra_size=" << fmt_ex->extra_size << endl;
            }

            else if (memcmp(chunk_hdr.id, "data", 4) == 0)
            {
                // process chunk data, according to fmt, as needed...
                size = padded_size;

                if(bits_per_sample == 16)
                {
                    //size = padded_size / 2;
                }

                data = new unsigned char[size];

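                // read() expects char*, so cast the unsigned buffer
                file.read(reinterpret_cast<char*>(data), size);

                // size already equals padded_size, so nothing is left to skip here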
                if (!file) return false;
            }
            else
            {
                // process other chunks as needed...

                file.ignore(padded_size);
                if (!file) return false;
            }

        }while (!file.eof());
         return true;
     }

 }

This is where the "conversion to double" happens:

if(bits_per_sample == 8)
        {
            uint8_t c;  
            //cout << size;
            for(unsigned i=0; (i < size); i++)
            {
                c = (unsigned)(unsigned char)(data[i]);
                double t = (c-128)/128.0;
                rawSignal.push_back(t);
            }
        }
        else if(bits_per_sample == 16)
        {

            for (int i = 0; i < size; i += 2)
            {
                int c;
                c = (unsigned) (unsigned char) (data[i + 2] << 8) | data[i];
                double t = c/32768.0;
                rawSignal.push_back(t);
            }
        }

Note that 8-bit files work correctly.

I suspect your problem may be that data is an array of signed char values. So, when you do this:

int c = (data[i + 1] << 8) | data[i];

… it's not actually doing what you wanted. Let's look at some simple examples.

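If data[i+1] == 64 and data[i] == 64, that's going to be 0x4000 | 0x40, or 0x4040, all good.

If data[i+1] == -64 and data[i] == -64, both values are sign-extended to int, so that's going to be 0xffffc000 | 0xffffffc0, or 0xffffffc0. That is just -64 again: the sign bits of the low byte have overwritten the high byte, which is obviously wrong.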

If you were using unsigned char values, this would work, because instead of -64 those numbers would be 192, and you'd end up with 0xc000 | 0xc0 or 0xc0c0, just as you want. (But then your /32768.0 would give you numbers in the range 0.0 to 2.0, when you presumably want -1.0 to 1.0.)
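The three cases above can be checked with a small sketch. The helper names combine_signed and combine_unsigned are hypothetical (they are not in the question's code), and the sketch assumes a two's complement platform:

```cpp
// Combine two bytes held as signed char, as the question's expression does
// if data is an array of (signed) char. Both operands are sign-extended to
// int before the OR. hi * 256 stands in for hi << 8 here, because
// left-shifting a negative value is undefined behaviour before C++20.
constexpr int combine_signed(signed char lo, signed char hi) {
    return (hi * 256) | lo;
}

// The same combination with the bytes held as unsigned char: the values are
// promoted to int without sign extension, so the high byte survives the OR.
constexpr int combine_unsigned(unsigned char lo, unsigned char hi) {
    return (hi << 8) | lo;
}
```

With these definitions, combine_signed(64, 64) is 0x4040, combine_signed(-64, -64) collapses to -64, and combine_unsigned(192, 192) is 0xc0c0.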

Suggesting a "fix" is difficult without knowing exactly what you're trying to do. Obviously you want to convert some kind of 16-bit little-endian integer format into some kind of floating-point format, but a lot rests on the exact details of those formats, and you haven't provided them. In standard PCM .wav files, 16-bit samples are signed little-endian integers (only 8-bit samples are unsigned), so reading the bytes through unsigned char * fixes the byte-combining step, but the combined 16-bit value must then be reinterpreted as a signed 16-bit integer before dividing. If you skip that step, /32768.0 gives values from 0.0 to 2.0, which no audio format I know of uses; with it, you get the usual -1.0 to 1.0 range.
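As a rough illustration, here is one way the conversion could look, assuming the data chunk holds 16-bit PCM, which the WAV format stores as signed little-endian samples. The names decode_s16le, s16le_to_unit and convert_s16le are made up for this sketch, and it needs C++14 or later:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assemble one signed 16-bit sample from two unsigned bytes (low byte first).
constexpr int decode_s16le(std::uint8_t lo, std::uint8_t hi) {
    const int u = (hi << 8) | lo;          // 0 .. 65535, no sign extension
    return u >= 0x8000 ? u - 0x10000 : u;  // map to -32768 .. 32767
}

// Scale one sample to the range [-1.0, 1.0).
constexpr double s16le_to_unit(std::uint8_t lo, std::uint8_t hi) {
    return decode_s16le(lo, hi) / 32768.0;
}

// Convert a whole buffer of raw sample bytes; a trailing odd byte is ignored.
std::vector<double> convert_s16le(const std::vector<std::uint8_t>& bytes) {
    std::vector<double> out;
    out.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        out.push_back(s16le_to_unit(bytes[i], bytes[i + 1]));
    }
    return out;
}
```

The explicit subtraction of 0x10000 is used instead of a cast to int16_t so that the result does not depend on how the compiler converts out-of-range values. For stereo files the output would still be interleaved (left, right, left, right), which this sketch does not separate.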
