
Calculate product of floating point numbers with fixed point arithmetic

I have two jobs to do:

  1. I need to store two sensor values in memory (using an FPU)
  2. and then calculate the product of those two values without an FPU.

My sensors are:

  • Sensor 1 gives values between 0 and 40 with a resolution of 0.2
  • Sensor 2 gives values between -10 and 10 with a resolution of 0.1

To store the data efficiently, I converted the values to integers using an offset and a scaling factor:

uint8 uint_value = (floating_value - offset) / resolution;

Sample data:

Sensor 1:

Real World Value 0.0 -> uint_value = 0
Real World Value 20.2 -> uint_value = 101

Sensor 2:

Real World Value -10.0 -> uint_value = 0
Real World Value 0.5 -> uint_value = 105
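
The encoding above can be sketched in C as follows. This is a minimal sketch, not the asker's actual code: the function name `encode_sensor` is hypothetical, and rounding to nearest (the `+ 0.5`) is an assumption added here, because 0.2 and 0.1 are not exact in binary and plain truncation can land one step low when the quotient falls just under an integer.

```c
#include <stdint.h>

/* Hypothetical helper: map a reading to an unsigned 8-bit code.
 * Assumes value >= offset, so the quotient is non-negative and
 * adding 0.5 before truncating rounds to the nearest step. */
static uint8_t encode_sensor(double value, double offset, double resolution)
{
    return (uint8_t)((value - offset) / resolution + 0.5);
}
```

With the ranges from the question, `encode_sensor(20.2, 0.0, 0.2)` yields 101 and `encode_sensor(0.5, -10.0, 0.1)` yields 105, matching the sample data.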

Now I'm having some problems with the second task. I need to calculate the product of those two values without using floating point arithmetic.

How do I do this? I have looked into fixed-point numbers, which make it possible to multiply using integer multiplication and shifting. But I can't figure out how to convert my scaled values to fixed-point numbers.

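Perform an integer multiplication; the number of fractional bits in the product is the sum of the fractional bits of the two operands.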

The problem you are running into is, basically, that 0.2 and 0.1 cannot be represented exactly as binary fractions. You scaled one value by 5 and the other by 10, so at some point you need to divide their product by 50, which is not possible with a simple shift. Your calculations will not be exact, regardless of whether you use floating-point or fixed-point operations. You must ask yourself how much precision you require in the result.
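
For the encoding described in the question, the divide-by-50 relationship can be written out with integers only. The sketch below assumes Sensor 1 is stored as `value / 0.2` and Sensor 2 as `(value + 10) / 0.1`, so the real values are `s1 / 5` and `(s2 - 100) / 10`; the function name is hypothetical.

```c
#include <stdint.h>

/* Product of the two readings, expressed in units of 1/50 (0.02).
 * real1 = s1 / 5 and real2 = (s2 - 100) / 10, so
 * real1 * real2 = s1 * (s2 - 100) / 50. */
static int32_t product_in_fiftieths(uint8_t s1, uint8_t s2)
{
    return (int32_t)s1 * ((int32_t)s2 - 100);
}
```

For the sample readings 20.2 and 0.5 (codes 101 and 105) this returns 505, that is 505/50 = 10.1. The count of fiftieths is itself an exact integer; the inexactness appears only when that count is converted to a binary fraction.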

To use fixed-point arithmetic, multiply each sensor value by a power of 2, say 256. Use a signed 16-bit variable and don't subtract any offset. The result of this multiplication will have the top 8 bits representing the integer part of the sensor reading and the bottom 8 bits representing an inexact fractional part of the sensor reading.
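
The conversion described above might look like the following on the side that still has an FPU. This is a sketch under assumptions: the name `to_q8_8` is hypothetical, rounding to nearest is a choice made here (truncation would also work, with a slightly larger error), and the input must stay within roughly -128 to 128 so that the result fits in 16 bits.

```c
#include <stdint.h>

/* Convert a reading to signed Q8.8: 8 integer bits (including sign)
 * and 8 fraction bits, i.e. the value multiplied by 256.
 * Rounds to nearest, with halves rounded away from zero. */
static int16_t to_q8_8(double value)
{
    double scaled = value * 256.0;
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

For example, 20.2 becomes 5171 (exactly 20.19921875) and 0.5 becomes 128, which is exact because 0.5 is a binary fraction.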

Multiply these two 16-bit values to obtain a signed 32-bit result. The bottom 16 bits represent the fractional part of the product. Now you can round or truncate those fractional bits to your desired level of precision. But you can't say your precision will be 0.02, because that value cannot be represented exactly as a binary fraction. Your precision must be of the form 1/2^F, where F is the number of fraction bits you retain in the final answer.
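
A minimal sketch of this multiply-and-round step, assuming both inputs are already in the signed format with 8 fraction bits described above. The function names are hypothetical, and keeping F = 8 fraction bits in the result is just one possible choice.

```c
#include <stdint.h>

/* Multiply two values that each have 8 fraction bits.
 * The raw 32-bit product has 16 fraction bits. */
static int32_t q8_8_mul_raw(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Reduce a value with 16 fraction bits to 8 fraction bits,
 * rounding to nearest (halves away from zero). Division is used
 * instead of >> because right-shifting a negative signed value
 * is implementation-defined in C. Assumes p is far enough from
 * the int32_t limits that adding 128 cannot overflow, which holds
 * for the sensor ranges in the question. */
static int32_t reduce_to_8_fraction_bits(int32_t p)
{
    return (p + (p >= 0 ? 128 : -128)) / 256;
}
```

With 20.2 stored as 5171 and 0.5 stored as 128, the raw product is 661888 and the reduced result is 2586, that is 2586/256 = 10.1015625 against the ideal 10.1. The result is kept in 32 bits because products such as 40 * -10 = -400 do not fit in 8 integer bits.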


 