Dividing two integers and rounding up the result, without using floating point

I need to divide two integers and round the result up. Is there a better way to do this?

int myValue = (int) ceil( (float)myIntNumber / myOtherInt );

I find it overkill to have to cast twice. (The outer int cast is just to silence the warning.)

Note that I have to cast to float internally, because otherwise the division is done on integers:

int a = ceil(256/11); // example: should be 24, but it is 23

Assuming that both myIntNumber and myOtherInt are positive, you could do:

int myValue = (myIntNumber + myOtherInt - 1) / myOtherInt;
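
As a rough illustration of why this works for positive operands: adding myOtherInt - 1 pushes any numerator with a nonzero remainder past the next multiple, so the truncating division lands on the ceiling. The sketch below wraps the expression in a hypothetical helper, `ceilDivPositive` (the name is an assumption, not from the answer), and checks a few values at compile time. Note that the sum can overflow when the numerator is close to INT_MAX.

```cpp
// Hypothetical helper (name assumed for illustration): ceiling division
// for strictly positive operands, using the formula from the answer.
constexpr int ceilDivPositive(int numerator, int denominator) {
    // Adding (denominator - 1) bumps any inexact quotient up by one
    // before the truncating division. Overflows if numerator is near INT_MAX.
    return (numerator + denominator - 1) / denominator;
}

static_assert(ceilDivPositive(256, 11) == 24, "23.27... rounds up to 24");
static_assert(ceilDivPositive(22, 11) == 2, "exact quotients are unchanged");
static_assert(ceilDivPositive(1, 11) == 1, "a small positive numerator gives 1");
```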

With help from DyP, I came up with the following branchless formula:

int idiv_ceil ( int numerator, int denominator )
{
    return numerator / denominator
             + (((numerator < 0) ^ (denominator > 0)) && (numerator%denominator));
}

It avoids floating-point conversions and passes a basic suite of unit tests, as shown here:


Here's another version that avoids the modulo operator.

int idiv_ceil ( int numerator, int denominator )
{
    int truncated = numerator / denominator;
    return truncated + (((numerator < 0) ^ (denominator > 0)) &&
                                             (numerator - truncated*denominator));
}

The first one will be faster on processors where IDIV returns both quotient and remainder (and the compiler is smart enough to use that).
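
To see why the correction term works: C++ integer division truncates toward zero, which already equals the ceiling when the exact quotient is negative. Only when the operands have the same sign and the division is inexact does the truncated result need a +1. The sketch below restates the modulo variant under an assumed name, `idivCeil`, and compares it with std::ceil over a small range. It is an illustrative check, not the unit-test suite mentioned above.

```cpp
#include <cmath>

// Modulo variant from the answer, restated under an assumed name.
int idivCeil(int numerator, int denominator) {
    // (numerator < 0) ^ (denominator > 0) is true when both operands have
    // the same sign, i.e. the exact quotient is non-negative and truncation
    // rounded it down; the remainder test skips exact divisions.
    return numerator / denominator
         + (((numerator < 0) ^ (denominator > 0)) && (numerator % denominator));
}

// Compare against a floating-point reference for small operands, where
// double is precise enough for std::ceil to give the true ceiling.
bool agreesWithCeil(int limit) {
    for (int n = -limit; n <= limit; ++n) {
        for (int d = -limit; d <= limit; ++d) {
            if (d == 0) continue;  // division by zero is undefined
            int expected = static_cast<int>(std::ceil(static_cast<double>(n) / d));
            if (idivCeil(n, d) != expected) return false;
        }
    }
    return true;
}
```

Calling agreesWithCeil(50) covers all four sign combinations as well as exact and inexact quotients.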

Maybe it is just easier to do:

int result = dividend / divisor;
if(dividend % divisor != 0)
    result++;
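
A minimal sketch of this approach as a function (the name `ceilDivSimple` is assumed for illustration). One caveat: because integer division in C++ truncates toward zero, the unconditional increment is only correct when the exact quotient is positive. For a negative inexact quotient the truncated value is already the ceiling.

```cpp
// Assumed helper name; mirrors the snippet above (constexpr needs C++14).
constexpr int ceilDivSimple(int dividend, int divisor) {
    int result = dividend / divisor;
    if (dividend % divisor != 0)
        result++;
    return result;
}

static_assert(ceilDivSimple(7, 2) == 4, "3.5 rounds up to 4");
static_assert(ceilDivSimple(8, 2) == 4, "exact division is unchanged");
// Truncation gives -3, which is already ceil(-3.5); the increment
// then overshoots to -2.
static_assert(ceilDivSimple(-7, 2) == -2, "not the ceiling for negative quotients");
```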

Benchmarks

Since a lot of different methods are shown in the answers, and none of the answers actually demonstrates any performance advantage, I tried to benchmark them myself. My plan was to write an answer that contains a short table and a definite statement of which method is the fastest.

Unfortunately it wasn't that simple. (It never is.) It seems that the performance of the rounding formulas depends on the data type, the compiler and the optimization level.

In one case there is a 7.5× difference in speed between one method and another. So the impact can be significant for some people.

TL;DR

For long integers the naive version using a type cast to float and std::ceil was actually the fastest. This was interesting for me personally, since I intended to use it with size_t, which is usually defined as unsigned long.

For ordinary ints it depends on your optimization level. For lower levels @Jwodder's solution performs best. For the highest levels std::ceil was the optimal one. With one exception: for the clang/unsigned int combination @Jwodder's was still better.

The solutions from the accepted answer never really outperformed the other two. You should keep in mind, however, that @Jwodder's solution doesn't work with negative numbers.

Results are at the bottom.

Implementations

To recap, here are the four methods I benchmarked and compared:

@Jwodder's version (Jwodder)

template<typename T>
inline T divCeilJwodder(const T& numerator, const T& denominator)
{
    return (numerator + denominator - 1) / denominator;
}

@Ben Voigt's version using modulo (VoigtModulo)

template<typename T>
inline T divCeilVoigtModulo(const T& numerator, const T& denominator)
{
    return numerator / denominator + (((numerator < 0) ^ (denominator > 0))
        && (numerator%denominator));
}

@Ben Voigt's version without using modulo (VoigtNoModulo)

template<typename T>
inline T divCeilVoigtNoModulo(const T& numerator, const T& denominator)
{
    T truncated = numerator / denominator;
    return truncated + (((numerator < 0) ^ (denominator > 0))
        && (numerator - truncated*denominator));
}

OP's implementation (TypeCast)

template<typename T>
inline T divCeilTypeCast(const T& numerator, const T& denominator)
{
    return (int)std::ceil((double)numerator / denominator);
}
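
One caveat for the floating-point approach with 64-bit types, sketched below under the assumption that double is an IEEE 754 binary64 value: it has a 53-bit significand, so large long or unsigned long operands may be rounded during the conversion and the result can be off by one or more. The helper names are hypothetical, and the floating-point helper here returns long long rather than int (a deliberate deviation from TypeCast) so that large results are not narrowed.

```cpp
#include <cmath>

// Floating-point variant in the spirit of TypeCast, but returning long long
// (assumption made here so that large results are not narrowed to int).
long long ceilDivViaDouble(long long numerator, long long denominator) {
    return static_cast<long long>(
        std::ceil(static_cast<double>(numerator) / denominator));
}

// Pure integer variant (Jwodder's formula), valid for positive operands.
long long ceilDivInteger(long long numerator, long long denominator) {
    return (numerator + denominator - 1) / denominator;
}

// 2^53 + 1 is the smallest positive integer that a binary64 double
// cannot represent exactly, so the conversion already loses the low bit.
const long long kBig = (1LL << 53) + 1;
```

With these definitions, ceilDivInteger(kBig, 1) returns kBig, while ceilDivViaDouble(kBig, 1) returns kBig - 1 under the default round-to-nearest mode.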

Methodology

In a single batch the division is performed 100 million times. Ten batches are run for each combination of compiler/optimization level, data type and implementation. The values shown below are the averages over all 10 batches, in milliseconds. The errors given are standard deviations.
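
The procedure described above could be sketched roughly as follows. This is not the author's benchmark source; the function names, the operand pattern, the use of std::chrono::steady_clock and the choice of population standard deviation are all assumptions for illustration.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// One of the benchmarked formulas, repeated to keep the sketch self-contained.
template <typename T>
inline T divCeilJwodder(const T& numerator, const T& denominator) {
    return (numerator + denominator - 1) / denominator;
}

// Time one batch of `iterations` divisions and return the elapsed milliseconds.
// The results are XOR-ed together and written to `sink` so the loop has a
// value the caller can use; a real benchmark would need a stronger guard
// against the optimizer.
template <typename T, typename F>
double timeBatchMs(F divCeil, std::size_t iterations, T& sink) {
    const auto start = std::chrono::steady_clock::now();
    T acc = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        // Operand pattern is arbitrary (assumed); both values stay positive.
        acc ^= divCeil(static_cast<T>(i % 1000 + 1), static_cast<T>(i % 7 + 1));
    }
    sink = acc;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Stats { double mean; double stddev; };

// Mean and population standard deviation of the batch timings.
Stats summarize(const std::vector<double>& samples) {
    double mean = 0.0;
    for (double s : samples) mean += s;
    mean /= samples.size();
    double variance = 0.0;
    for (double s : samples) variance += (s - mean) * (s - mean);
    variance /= samples.size();
    return {mean, std::sqrt(variance)};
}
```

For example, calling timeBatchMs<int>(divCeilJwodder<int>, 100000000, sink) ten times and passing the ten timings to summarize would produce one "mean ± standard deviation" cell.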

The whole source code that was used can be found here. You might also find this script useful, which compiles and executes the source with different compiler flags.

The whole benchmark was performed on an i7-7700K. The compiler versions used were GCC 10.2.0 and clang 11.0.1.

Results

Now, without further ado, here are the results:

Times are in milliseconds (mean ± standard deviation over 10 batches). 🏆 marks the fastest result(s) in each column.

GCC

| DataType | Algorithm | -O0 | -O1 | -O2 | -O3 | -Os | -Ofast | -Og |
|---|---|---|---|---|---|---|---|---|
| unsigned | Jwodder | 264.1 ± 0.9 🏆 | 175.2 ± 0.9 🏆 | 153.5 ± 0.7 🏆 | 175.2 ± 0.5 🏆 | 153.3 ± 0.5 | 153.4 ± 0.8 | 175.5 ± 0.6 🏆 |
| unsigned | VoigtModulo | 528.5 ± 2.5 | 306.5 ± 1.0 | 175.8 ± 0.7 | 175.2 ± 0.5 🏆 | 175.6 ± 0.7 | 175.4 ± 0.6 | 352.0 ± 1.0 |
| unsigned | VoigtNoModulo | 375.3 ± 1.5 | 175.7 ± 1.3 🏆 | 192.5 ± 1.4 | 197.6 ± 1.9 | 200.6 ± 7.2 | 176.1 ± 1.5 | 197.9 ± 0.5 |
| unsigned | TypeCast | 348.5 ± 2.7 | 231.9 ± 3.9 | 234.4 ± 1.3 | 226.6 ± 1.0 | 137.5 ± 0.8 🏆 | 138.7 ± 1.7 🏆 | 243.8 ± 1.4 |
| unsigned long | Jwodder | 658.6 ± 2.5 | 546.3 ± 0.9 | 549.3 ± 1.8 | 549.1 ± 2.8 | 540.6 ± 3.4 | 548.8 ± 1.3 | 486.1 ± 1.1 |
| unsigned long | VoigtModulo | 1,169.0 ± 2.9 | 1,015.9 ± 4.4 | 550.8 ± 2.0 | 504.0 ± 1.4 | 550.3 ± 1.2 | 550.5 ± 1.3 | 1,020.1 ± 2.9 |
| unsigned long | VoigtNoModulo | 768.1 ± 1.7 | 559.1 ± 1.8 | 534.4 ± 1.6 | 533.7 ± 1.5 | 559.5 ± 1.7 | 534.3 ± 1.5 | 571.5 ± 1.3 |
| unsigned long | TypeCast | 353.3 ± 2.5 🏆 | 267.5 ± 1.7 🏆 | 248.0 ± 1.6 🏆 | 243.8 ± 1.2 🏆 | 154.2 ± 0.8 🏆 | 154.1 ± 1.0 🏆 | 263.8 ± 1.8 🏆 |
| int | Jwodder | 307.9 ± 1.3 🏆 | 175.4 ± 0.9 🏆 | 175.3 ± 0.5 🏆 | 175.4 ± 0.6 🏆 | 175.2 ± 0.5 | 175.1 ± 0.6 | 175.1 ± 0.5 🏆 |
| int | VoigtModulo | 528.5 ± 1.9 | 351.9 ± 4.6 | 175.3 ± 0.6 🏆 | 175.2 ± 0.4 🏆 | 197.1 ± 0.6 | 175.2 ± 0.8 | 373.5 ± 1.1 |
| int | VoigtNoModulo | 398.0 ± 2.5 | 241.0 ± 0.7 | 199.4 ± 5.1 | 219.2 ± 1.0 | 175.9 ± 1.2 | 197.7 ± 1.2 | 242.9 ± 3.0 |
| int | TypeCast | 338.8 ± 4.9 | 228.1 ± 0.9 | 230.3 ± 0.8 | 229.5 ± 9.4 | 153.8 ± 0.4 🏆 | 138.3 ± 1.0 🏆 | 241.1 ± 1.1 |
| long | Jwodder | 758.1 ± 1.8 | 600.6 ± 0.9 | 601.5 ± 2.2 | 601.5 ± 2.8 | 581.2 ± 1.9 | 600.6 ± 1.8 | 586.3 ± 3.6 |
| long | VoigtModulo | 1,360.8 ± 6.1 | 1,202.0 ± 2.1 | 600.0 ± 2.4 | 600.0 ± 3.0 | 607.0 ± 6.8 | 599.0 ± 1.6 | 1,187.2 ± 2.6 |
| long | VoigtNoModulo | 844.5 ± 6.9 | 647.3 ± 1.3 | 628.9 ± 1.8 | 627.9 ± 1.6 | 629.1 ± 2.4 | 629.6 ± 4.4 | 668.2 ± 2.7 |
| long | TypeCast | 366.1 ± 0.8 🏆 | 246.2 ± 1.1 🏆 | 245.3 ± 1.8 🏆 | 244.6 ± 1.1 🏆 | 154.6 ± 1.6 🏆 | 154.3 ± 0.5 🏆 | 257.4 ± 1.5 🏆 |

clang

| DataType | Algorithm | -O0 | -O1 | -O2 | -O3 | -Ofast | -Os | -Oz |
|---|---|---|---|---|---|---|---|---|
| unsigned | Jwodder | 329.4 ± 1.3 🏆 | 220.0 ± 1.3 🏆 | 146.2 ± 0.6 🏆 | 146.2 ± 0.6 🏆 | 146.0 ± 0.5 🏆 | 153.2 ± 0.3 🏆 | 153.5 ± 0.6 🏆 |
| unsigned | VoigtModulo | 588.9 ± 6.4 | 408.7 ± 1.5 | 164.8 ± 1.0 | 164.0 ± 0.4 | 164.1 ± 0.4 | 175.2 ± 0.5 | 175.8 ± 0.9 |
| unsigned | VoigtNoModulo | 541.0 ± 1.8 | 263.1 ± 0.9 | 186.4 ± 0.6 | 186.4 ± 1.2 | 187.2 ± 1.1 | 197.2 ± 0.8 | 197.1 ± 0.7 |
| unsigned | TypeCast | 591.2 ± 2.4 | 591.3 ± 2.6 | 155.8 ± 1.9 | 155.9 ± 1.6 | 155.9 ± 2.4 | 214.6 ± 1.9 | 213.6 ± 1.1 |
| unsigned long | Jwodder | 638.1 ± 1.8 | 613.3 ± 2.1 | 190.0 ± 0.8 🏆 | 182.7 ± 0.5 | 182.4 ± 0.5 | 496.2 ± 1.3 | 554.1 ± 1.0 |
| unsigned long | VoigtModulo | 1,259.0 ± 9.0 | 1,136.5 ± 4.2 | 187.0 ± 3.4 🏆 | 199.7 ± 6.1 | 197.6 ± 1.0 | 549.4 ± 1.7 | 506.8 ± 4.4 |
| unsigned long | VoigtNoModulo | 879.5 ± 10.8 | 617.8 ± 2.1 | 223.4 ± 1.3 | 231.3 ± 1.3 | 231.4 ± 1.1 | 594.6 ± 1.9 | 572.2 ± 0.8 |
| unsigned long | TypeCast | 365.5 ± 1.6 🏆 | 316.9 ± 1.8 🏆 | 189.7 ± 2.1 🏆 | 156.3 ± 1.8 🏆 | 157.0 ± 2.2 🏆 | 155.1 ± 0.9 🏆 | 176.5 ± 1.2 🏆 |
| int | Jwodder | 307.4 ± 1.2 🏆 | 219.6 ± 0.6 🏆 | 146.0 ± 0.3 🏆 | 153.5 ± 0.5 | 153.6 ± 0.8 | 175.4 ± 0.7 🏆 | 175.2 ± 0.5 🏆 |
| int | VoigtModulo | 598.7 ± 5.1 | 460.6 ± 1.3 | 175.4 ± 0.4 | 164.3 ± 0.9 | 164.0 ± 0.4 | 176.3 ± 1.6 🏆 | 460.0 ± 0.8 |
| int | VoigtNoModulo | 543.5 ± 2.3 | 350.6 ± 1.0 | 186.6 ± 1.2 | 185.7 ± 0.3 | 186.3 ± 1.1 | 197.1 ± 0.6 | 373.3 ± 1.6 |
| int | TypeCast | 590.0 ± 2.1 | 589.9 ± 0.8 | 155.2 ± 2.4 | 149.4 ± 1.6 🏆 | 148.4 ± 1.2 🏆 | 214.6 ± 2.2 | 211.7 ± 1.6 |
| long | Jwodder | 745.9 ± 3.6 | 685.8 ± 2.2 | 183.1 ± 1.0 | 182.5 ± 0.5 | 182.6 ± 0.6 | 553.2 ± 1.5 | 488.0 ± 0.8 |
| long | VoigtModulo | 1,439.6 ± 6.7 | 1,346.5 ± 2.9 | 197.9 ± 0.7 | 208.2 ± 0.6 | 208.0 ± 0.4 | 548.9 ± 1.4 | 1,326.4 ± 3.0 |
| long | VoigtNoModulo | 1,019.5 ± 3.2 | 715.1 ± 8.2 | 224.3 ± 4.8 | 219.0 ± 1.0 | 219.0 ± 0.6 | 561.7 ± 2.5 | 769.4 ± 9.3 |
| long | TypeCast | 591.8 ± 4.1 🏆 | 590.4 ± 1.3 🏆 | 154.5 ± 1.3 🏆 | 135.4 ± 8.3 🏆 | 132.9 ± 0.7 🏆 | 132.8 ± 1.2 🏆 | 177.4 ± 2.3 🏆 |

Now I can finally get on with my life :P

Integer division with round-up.

Only 1 division executed per call, no % or * or conversion to/from floating point, works for positive and negative int. See note (1).

n (numerator) = OP's myIntNumber;
d (denominator) = OP's myOtherInt;

The following approach is simple. int division rounds toward 0. For negative quotients, this is a round up, so nothing special is needed. For positive quotients, add d-1 to effect a round up, then perform an unsigned division.

Note (1) The usual divide by 0 blows things up, and MININT/-1 fails as expected on 2's complement machines.

int IntDivRoundUp(int n, int d) {
  // If n and d are the same sign ... 
  if ((n < 0) == (d < 0)) {
    // If n (and d) are negative ...
    if (n < 0) {
      n = -n;
      d = -d;
    }
    // Unsigned division rounds down.  Adding d-1 to n effects a round up.
    return (((unsigned) n) + ((unsigned) d) - 1)/((unsigned) d);  
  }
  else {
    return n/d;
  }
}
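
A few spot checks for the four sign combinations, using a copy of the function above restated as constexpr (an assumption that requires C++14 or later; the name `intDivRoundUpCx` is hypothetical). On typical platforms where unsigned has one more value bit than int, doing the addition in unsigned also means n + d - 1 cannot overflow for same-sign operands, apart from the MININT case in note (1).

```cpp
// constexpr restatement of IntDivRoundUp (assumed name; needs C++14).
constexpr int intDivRoundUpCx(int n, int d) {
    if ((n < 0) == (d < 0)) {           // same sign: quotient is non-negative
        if (n < 0) { n = -n; d = -d; }  // make both operands positive
        return static_cast<int>(
            (static_cast<unsigned>(n) + static_cast<unsigned>(d) - 1)
            / static_cast<unsigned>(d));
    }
    return n / d;                       // truncation is already a round up
}

static_assert(intDivRoundUpCx(7, 2) == 4, "+/+");
static_assert(intDivRoundUpCx(-7, -2) == 4, "-/-");
static_assert(intDivRoundUpCx(-7, 2) == -3, "-/+");
static_assert(intDivRoundUpCx(7, -2) == -3, "+/-");
static_assert(intDivRoundUpCx(0, 5) == 0, "zero numerator");
```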

[Edit: test code removed, see earlier rev as needed]

Just use

int ceil_of_division = ((dividend-1)/divisor)+1;

For example:

for (int i=0;i<20;i++)
    std::cout << i << "/8 = " << ((i-1)/8)+1 << std::endl;
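
One thing the loop above shows at i = 0: with C++'s truncating division, (0 - 1) / 8 is 0, so the expression yields 1 rather than 0. The formula is therefore best treated as valid for a positive dividend and divisor. A minimal sketch that guards the zero case (the name `ceilDivNonNegative` is an assumption for illustration):

```cpp
// Assumed helper: ceiling division for dividend >= 0 and divisor > 0.
constexpr int ceilDivNonNegative(int dividend, int divisor) {
    // (dividend - 1) / divisor + 1 is only right for dividend > 0: for a
    // divisor greater than 1, (-1) / divisor truncates to 0 instead of
    // flooring to -1.
    return dividend == 0 ? 0 : (dividend - 1) / divisor + 1;
}

static_assert((0 - 1) / 8 + 1 == 1, "unguarded formula at dividend 0");
static_assert(ceilDivNonNegative(0, 8) == 0, "guarded");
static_assert(ceilDivNonNegative(8, 8) == 1, "exact");
static_assert(ceilDivNonNegative(9, 8) == 2, "rounds up");
```

Compared with (dividend + divisor - 1) / divisor, this form has no intermediate sum that could overflow near INT_MAX.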

A small hack is to do:

int divideUp(int a, int b) {
    return (a - 1) / b + 1;
}
// Proof (for a > 0, b > 0):
// a = b*N + k with 0 <= k < b (always)
// if k == 0, then
//   (a-1)       == b*N  - 1
//   (a-1)/b     == N - 1
//   (a-1)/b + 1 == N ---> Good !
//
// if k > 0, then
//   (a-1)       == b*N   + l   (with l = k - 1, so 0 <= l < b)
//   (a-1)/b     == N
//   (a-1)/b + 1 == N+1  ---> Good !

Instead of using the ceil function before casting to int, you can add a constant which is very nearly (but not quite) equal to 1. This way, nearly anything (except a value which is exactly or incredibly close to an actual integer) will be increased by one before it is truncated.

Example:

#define EPSILON (0.9999)

int myValue = (int)(((float)myIntNumber)/myOtherInt + EPSILON);

EDIT: after seeing your response to the other post, I want to clarify that this will round up, not away from zero: negative numbers will become less negative, and positive numbers will become more positive.
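
A hedged sketch of where this approach reaches its limits, with EPSILON turned into a typed constant and the expression wrapped in a hypothetical function, `ceilDivEpsilon`. Two cases are worth checking before relying on it: a fractional part smaller than 1 - EPSILON is not rounded up, and because the final cast truncates toward zero, inexact negative quotients below -1 can end up one too high.

```cpp
// Same value as EPSILON above, as a typed constant instead of a macro.
constexpr double kEpsilon = 0.9999;

// Hypothetical wrapper around the expression from the answer.
constexpr int ceilDivEpsilon(int numerator, int denominator) {
    return static_cast<int>(static_cast<float>(numerator) / denominator + kEpsilon);
}

static_assert(ceilDivEpsilon(7, 2) == 4, "3.5 + 0.9999 truncates to 4");
static_assert(ceilDivEpsilon(6, 2) == 3, "exact quotients are unchanged");
// 1/20000 = 0.00005 is below 1 - 0.9999, so the sum stays under 1.
static_assert(ceilDivEpsilon(1, 20000) == 0, "the true ceiling is 1");
// -3.5 + 0.9999 = -2.5001, which truncates toward zero to -2.
static_assert(ceilDivEpsilon(-7, 2) == -2, "the true ceiling is -3");
```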
