简体   繁体   中英

Fixed point multiplication

I need to convert a value from one unit to another according to a non constant factor. The input value range from 0 to 1073676289 and the range value range from 0 to 1155625. The conversion can be described like this:

output = input * (range / 1073676289)

My own initial fixed point implementation feels a bit clumsy:

// Input values (examples)
unsigned int input = 536838144;  // min 0, max 1073676289
unsigned int range = 1155625;    // min 0, max 1155625

// Conversion
unsigned int tmp = (input >> 16) * ((range) >> 3u);
unsigned int output = (tmp / ((1073676289) >> 16u)) << 3u;

Can my code be improved to be simpler or to have better accuracy?

这将为您提供没有浮点值的最佳精度,结果将四舍五入为最接近的整数值:

output = (input * (long long) range + 536838144) / 1073676289;

The problem is that input * range would overflow a 32-bit integer. Fix that by using a 64-bit integer.

uint64_least_t tmp;
tmp = input;
tmp = tmp * range;
tmp = tmp / 1073676289ul;
output = temp;

A quick trip out to google brought http://sourceforge.net/projects/fixedptc/ to my attention

It's ac library in a header for managing fixed point math in 32 or 64 bit integers.

A little bit of experimentation with the following code:

#include <stdio.h>
#include <stdint.h>

#define FIXEDPT_BITS        64

#include "fixedptc.h"

int main(int argc, char ** argv)
{
    unsigned int input = 536838144;  // min 0, max 1073676289
    unsigned int range = 1155625;    // min 0, max 1155625

    // Conversion
    unsigned int tmp = (input >> 16) * ((range) >> 3u);
    unsigned int output = (tmp / ((1073676289) >> 16u)) << 3u;

    double output2 = (double)input * ((double)range / 1073676289.0);

    uint32_t output3 = fixedpt_toint(fixedpt_xmul(fixedpt_fromint(input), fixedpt_xdiv(fixedpt_fromint(range), fixedpt_fromint(1073676289))));

    printf("baseline = %g, better = %d, library = %d\n", output2, output, output3);

    return 0;
}

Got me the following results:

baseline = 577812, better = 577776, library = 577812

Showing better precision (matching the floating point) than you were getting with your code. Under the hood it's not doing anything terribly complicated (and doesn't work at all in 32 bits)

/* Multiplies two fixedpt numbers, returns the result. */
static inline fixedpt
fixedpt_mul(fixedpt A, fixedpt B)
{
    return (((fixedptd)A * (fixedptd)B) >> FIXEDPT_FBITS);
}


/* Divides two fixedpt numbers, returns the result. */
static inline fixedpt
fixedpt_div(fixedpt A, fixedpt B)
{
    return (((fixedptd)A << FIXEDPT_FBITS) / (fixedptd)B);
}

But it does show that you can get the precision you want. You'll just need 64 bits to do it

You won't get it any simpler then output = input * (range / 1073676289)

As noted below in the comments if you are restircted to integer operations then for range < 1073676289 : range / 1073676289 == 0 so you would be good to go with:

output = range < 1073676289 ? 0 : input

If that is not what you wanted and you actually want precision then

output = (input * range) / 1073676289

will be the way to go.

If you need to do a lot of those then i suggest you use double and have your compiler vectorise your operations. Precision will be ok too.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM