How to efficiently move the decimal point in a number until reaching some threshold

Say I have a double x whose value is > 0 and < 1 million. I want to move its decimal point to the right until the value is at least 1 million and < 10 million. So, for example, 23.129385 becomes 2312938.5.

What I'm doing now is just multiplying by 10 until the stopping condition is reached. However, I run this algorithm a lot, so if I can optimize it somehow it would be helpful. A constant-time solution, regardless of the magnitude of x, is obviously ideal, but so far I haven't been able to come up with one.
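For reference, the baseline described above might look like the following minimal sketch. The function name `scale_by_repeated_multiply` is made up for illustration and is not from the question.

```cpp
#include <cassert>

// Baseline: multiply by 10 until the value reaches the target range.
// The loop runs once per decimal place moved, so the cost grows as x shrinks.
double scale_by_repeated_multiply(double x) {
    assert(x > 0.0 && x < 1e6);  // range stated in the question
    while (x < 1e6) {
        x *= 10.0;
    }
    return x;
}
```

Each multiplication can introduce a small rounding error, so the result may differ in the last bits from a single multiplication by the right power of 10.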

Some languages, such as C++ with frexp, expose the binary exponent as an integer very cheaply.

If you are that lucky, you can precompute a lookup table pow2to10 that maps each of the roughly 2k possible binary exponents to the smallest power of 10 that a value with that exponent could have. Have another lookup table, lookup10, for the powers of 10. Now your computation looks like:

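    // Assumes lookup10[k] == 10^k and x >= 1, so that n >= 1;
    // offset the index into pow2to10 if x can be below 1.
    int n;
    frexp(x, &n);                   // x == m * 2^n with 0.5 <= m < 1
    int i = pow2to10[n];            // smallest decimal exponent possible for n
    if (lookup10[i + 1] <= x) {     // x is already in the next decade up
        i++;
    }
    double result = x * lookup10[6 - i];  // scale into [1e6, 1e7)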

Now, instead of a series of multiplications, you have 3 array lookups, one comparison and one multiplication. If you are executing this in a tight loop, store pow2to10 as an array of short int and trim the ranges to what you need, so that the lookups stay in a data structure that fits in L1 cache.
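A self-contained sketch of this idea follows, under the assumption that x is restricted to [1, 1e6), so only binary exponents 1 through 20 occur. The names `kPow10`, `kPow2To10` and `scale_with_frexp` are illustrative, and the table values were worked out by hand for that range only.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// kPow10[k] == 10^k for k = 0..6.
constexpr std::array<double, 7> kPow10 = {1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6};

// kPow2To10[n] == floor(log10(2^(n-1))): the smallest decimal exponent a
// value with frexp exponent n can have. Index 0 is unused padding.
constexpr std::array<short, 21> kPow2To10 = {
    0,
    0, 0, 0, 0,  // n = 1..4   -> x in [1, 16)
    1, 1, 1,     // n = 5..7   -> x in [16, 128)
    2, 2, 2,     // n = 8..10  -> x in [128, 1024)
    3, 3, 3, 3,  // n = 11..14 -> x in [1024, 16384)
    4, 4, 4,     // n = 15..17 -> x in [16384, 131072)
    5, 5, 5      // n = 18..20 -> x in [131072, 1048576)
};

double scale_with_frexp(double x) {
    assert(x >= 1.0 && x < 1e6);  // assumption of this sketch
    int n = 0;
    std::frexp(x, &n);            // x == m * 2^n with 0.5 <= m < 1
    int i = kPow2To10[n];
    // A binary interval [2^(n-1), 2^n) crosses at most one power of 10,
    // so a single comparison is enough to correct the guess.
    if (kPow10[i + 1] <= x) {
        ++i;
    }
    return x * kPow10[6 - i];     // result lies in [1e6, 1e7)
}
```

Supporting values below 1 would need a larger table and an offset on the index, because frexp then returns zero or negative exponents.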

If you are not so lucky, then instead of repeatedly multiplying you can just compare against an array of known powers of 10. Be warned that in a high-level language, you may find that the overhead of running instructions outweighs the savings of comparison vs. multiplication. It may be tempting to do a binary search to do fewer lookups, but I would bet on linear search being better because that helps branch prediction.
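The comparison-based fallback could be sketched like this, again assuming x is in [1, 1e6). The tables `kBound` and `kMult` and the function name are hypothetical, chosen for this example.

```cpp
#include <cassert>
#include <cstddef>

double scale_by_linear_search(double x) {
    assert(x >= 1.0 && x < 1e6);  // assumption of this sketch
    // kBound[k] is the exclusive upper bound of decade k;
    // kMult[k] is the multiplier that moves that decade into [1e6, 1e7).
    static const double kBound[6] = {1e1, 1e2, 1e3, 1e4, 1e5, 1e6};
    static const double kMult[6]  = {1e6, 1e5, 1e4, 1e3, 1e2, 1e1};
    std::size_t k = 0;
    while (x >= kBound[k]) {      // terminates because x < 1e6 == kBound[5]
        ++k;
    }
    return x * kMult[k];          // a single multiplication, no error build-up
}
```

If small values dominate the input, the loop usually exits after one or two comparisons, which is the case where linear search is most likely to beat a binary search.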

You don't say which language or which type of CPU, or how the numbers are distributed (e.g. whether most of them are less than 5 and only a few are large, or ...); however...

The fastest scalar version I can think of (assuming C and maybe modern 80x86 CPUs) is:

    // x is between 1 and 999999

    unsigned long x_int = x;        // Integer comparisons are possibly faster
    double multiplier;

    if(x_int < 1000) {
        // x is between 1 and 999
        if(x_int < 100) {
            // x is between 1 and 99
            if(x_int < 10) {
                // x is between 1 and 9
                multiplier = 1000000;
            } else {
                // x is between 10 and 99
                multiplier = 100000;
            }
        } else {
            // x is between 100 and 999
            multiplier = 10000;
        }
    } else {
        // x is between 1000 and 999999
        if(x_int < 10000) {
            // x is between 1000 and 9999
            multiplier = 1000;
        } else {
            // x is between 10000 and 999999
            if(x_int < 100000) {
                // x is between 10000 and 99999
                multiplier = 100;
            } else {
                // x is between 100000 and 999999
                multiplier = 10;
            }
        }
    }
    x *= multiplier;

This adds up to 2 or 3 branches and one multiplication per value. Note: for modern 80x86 the final branch can be replaced with a CMOVcc instruction.

If you're doing this a lot, then the next step would be to try to use SIMD to process multiple values at the same time (followed by multi-threading/multi-CPU).
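As a step in that direction, here is a branch-free batch sketch. It does not use SIMD intrinsics. It only removes the data-dependent branches by summing comparison results into a table index, which is the kind of loop a vectorizing compiler may be able to handle. Whether it actually gets vectorized depends on the compiler and flags. The name `scale_batch` and the table `kMult` are assumptions made for this example, and x is again assumed to be in [1, 1e6).

```cpp
#include <cassert>
#include <cstddef>

// Scales every element of values[0..count) into [1e6, 1e7) in place.
void scale_batch(double* values, std::size_t count) {
    static const double kMult[6] = {1e6, 1e5, 1e4, 1e3, 1e2, 1e1};
    for (std::size_t j = 0; j < count; ++j) {
        const double x = values[j];
        assert(x >= 1.0 && x < 1e6);  // assumption of this sketch
        // Each comparison yields 0 or 1; their sum is the decade index 0..5.
        const int decade = (x >= 1e1) + (x >= 1e2) + (x >= 1e3) +
                           (x >= 1e4) + (x >= 1e5);
        values[j] = x * kMult[decade];
    }
}
```

Compiling with NDEBUG defined removes the assert from the loop body, which is worth doing before measuring.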
