简体   繁体   中英

Dividing two integers and rounding up the result, without using floating point

I need to divide two numbers and round it up. Are there any better way to do this?

int myValue = (int) ceil( (float)myIntNumber / myOtherInt );

I find an overkill to have to cast two different time. (the extern int cast is just to shut down the warning)

Note I have to cast internally to float otherwise

int a = ceil(256/11); //> Should be 24, but it is 23
              ^example

假设myIntNumbermyOtherInt都是正数,你可以这样做:

int myValue = (myIntNumber + myOtherInt - 1) / myOtherInt;

With help from DyP, came up with the following branchless formula:

int idiv_ceil ( int numerator, int denominator )
{
    return numerator / denominator
             + (((numerator < 0) ^ (denominator > 0)) && (numerator%denominator));
}

It avoids floating-point conversions and passes a basic suite of unit tests, as shown here:


Here's another version that avoids the modulo operator.

int idiv_ceil ( int numerator, int denominator )
{
    int truncated = numerator / denominator;
    return truncated + (((numerator < 0) ^ (denominator > 0)) &&
                                             (numerator - truncated*denominator));
}

The first one will be faster on processors where IDIV returns both quotient and remainder (and the compiler is smart enough to use that).

Maybe it is just easier to do a:

int result = dividend / divisor;
if(dividend % divisor != 0)
    result++;

Benchmarks

Since a lot of different methods are shown in the answers and none of the answers actually prove any advantages in terms of performance I tried to benchmark them myself. My plan was to write an answer that contains a short table and a definite answer which method is the fastest.

Unfortunately it wasn't that simple. (It never is.) It seems that the performance of the rounding formulas depend on the used data type, compiler and optimization level.

In one case there is an increase of speed by 7.5× from one method to another. So the impact can be significant for some people.

TL;DR

For long integers the naive version using a type cast to float and std::ceil was actually the fastest. This was interesting for me personally since I intended to use it with size_t which is usually defined as unsigned long .

For ordinary int s it depends on your optimization level. For lower levels @Jwodder's solution performs best. For the highest levels std::ceil was the optimal one. With one exception: For the clang/ unsigned int combination @Jwodder's was still better.

The solutions from the accepted answer never really outperformed the other two. You should keep in mind however that @Jwodder's solution doesn't work with negatives.

Results are at the bottom.

Implementations

To recap here are the four methods I benchmarked and compared:

@Jwodder's version (Jwodder)

template<typename T>
inline T divCeilJwodder(const T& numerator, const T& denominator)
{
    return (numerator + denominator - 1) / denominator;
}

@Ben Voigt's version using modulo (VoigtModulo)

template<typename T>
inline T divCeilVoigtModulo(const T& numerator, const T& denominator)
{
    return numerator / denominator + (((numerator < 0) ^ (denominator > 0))
        && (numerator%denominator));
}

@Ben Voigt's version without using modulo (VoigtNoModulo)

inline T divCeilVoigtNoModulo(const T& numerator, const T& denominator)
{
    T truncated = numerator / denominator;
    return truncated + (((numerator < 0) ^ (denominator > 0))
        && (numerator - truncated*denominator));
}

OP's implementation (TypeCast)

template<typename T>
inline T divCeilTypeCast(const T& numerator, const T& denominator)
{
    return (int)std::ceil((double)numerator / denominator);
}

Methodology

In a single batch the division is performed 100 million times. Ten batches are calculated for each combination of Compiler/Optimization level, used data type and used implementation. The values shown below are the averages of all 10 batches in milliseconds. The errors that are given are standard deviations .

The whole source code that was used can be found here . Also you might find this script useful which compiles and executes the source with different compiler flags.

The whole benchmark was performed on a i7-7700K. The used compiler versions were GCC 10.2.0 and clang 11.0.1.

Results

Now without further ado here are the results:

DataType
Algorithm
GCC
-O0
-O1 -O2 -O3 -Os -Ofast -Og clang
-O0
-O1 -O2 -O3 -Ofast -Os -Oz
unsigned
Jwodder 264.1 ± 0.9 🏆 175.2 ± 0.9 🏆 153.5 ± 0.7 🏆 175.2 ± 0.5 🏆 153.3 ± 0.5 153.4 ± 0.8 175.5 ± 0.6 🏆 329.4 ± 1.3 🏆 220.0 ± 1.3 🏆 146.2 ± 0.6 🏆 146.2 ± 0.6 🏆 146.0 ± 0.5 🏆 153.2 ± 0.3 🏆 153.5 ± 0.6 🏆
VoigtModulo 528.5 ± 2.5 306.5 ± 1.0 175.8 ± 0.7 175.2 ± 0.5 🏆 175.6 ± 0.7 175.4 ± 0.6 352.0 ± 1.0 588.9 ± 6.4 408.7 ± 1.5 164.8 ± 1.0 164.0 ± 0.4 164.1 ± 0.4 175.2 ± 0.5 175.8 ± 0.9
VoigtNoModulo 375.3 ± 1.5 175.7 ± 1.3 🏆 192.5 ± 1.4 197.6 ± 1.9 200.6 ± 7.2 176.1 ± 1.5 197.9 ± 0.5 541.0 ± 1.8 263.1 ± 0.9 186.4 ± 0.6 186.4 ± 1.2 187.2 ± 1.1 197.2 ± 0.8 197.1 ± 0.7
TypeCast 348.5 ± 2.7 231.9 ± 3.9 234.4 ± 1.3 226.6 ± 1.0 137.5 ± 0.8 🏆 138.7 ± 1.7 🏆 243.8 ± 1.4 591.2 ± 2.4 591.3 ± 2.6 155.8 ± 1.9 155.9 ± 1.6 155.9 ± 2.4 214.6 ± 1.9 213.6 ± 1.1
unsigned long
Jwodder 658.6 ± 2.5 546.3 ± 0.9 549.3 ± 1.8 549.1 ± 2.8 540.6 ± 3.4 548.8 ± 1.3 486.1 ± 1.1 638.1 ± 1.8 613.3 ± 2.1 190.0 ± 0.8 🏆 182.7 ± 0.5 182.4 ± 0.5 496.2 ± 1.3 554.1 ± 1.0
VoigtModulo 1,169.0 ± 2.9 1,015.9 ± 4.4 550.8 ± 2.0 504.0 ± 1.4 550.3 ± 1.2 550.5 ± 1.3 1,020.1 ± 2.9 1,259.0 ± 9.0 1,136.5 ± 4.2 187.0 ± 3.4 🏆 199.7 ± 6.1 197.6 ± 1.0 549.4 ± 1.7 506.8 ± 4.4
VoigtNoModulo 768.1 ± 1.7 559.1 ± 1.8 534.4 ± 1.6 533.7 ± 1.5 559.5 ± 1.7 534.3 ± 1.5 571.5 ± 1.3 879.5 ± 10.8 617.8 ± 2.1 223.4 ± 1.3 231.3 ± 1.3 231.4 ± 1.1 594.6 ± 1.9 572.2 ± 0.8
TypeCast 353.3 ± 2.5 🏆 267.5 ± 1.7 🏆 248.0 ± 1.6 🏆 243.8 ± 1.2 🏆 154.2 ± 0.8 🏆 154.1 ± 1.0 🏆 263.8 ± 1.8 🏆 365.5 ± 1.6 🏆 316.9 ± 1.8 🏆 189.7 ± 2.1 🏆 156.3 ± 1.8 🏆 157.0 ± 2.2 🏆 155.1 ± 0.9 🏆 176.5 ± 1.2 🏆
int
Jwodder 307.9 ± 1.3 🏆 175.4 ± 0.9 🏆 175.3 ± 0.5 🏆 175.4 ± 0.6 🏆 175.2 ± 0.5 175.1 ± 0.6 175.1 ± 0.5 🏆 307.4 ± 1.2 🏆 219.6 ± 0.6 🏆 146.0 ± 0.3 🏆 153.5 ± 0.5 153.6 ± 0.8 175.4 ± 0.7 🏆 175.2 ± 0.5 🏆
VoigtModulo 528.5 ± 1.9 351.9 ± 4.6 175.3 ± 0.6 🏆 175.2 ± 0.4 🏆 197.1 ± 0.6 175.2 ± 0.8 373.5 ± 1.1 598.7 ± 5.1 460.6 ± 1.3 175.4 ± 0.4 164.3 ± 0.9 164.0 ± 0.4 176.3 ± 1.6 🏆 460.0 ± 0.8
VoigtNoModulo 398.0 ± 2.5 241.0 ± 0.7 199.4 ± 5.1 219.2 ± 1.0 175.9 ± 1.2 197.7 ± 1.2 242.9 ± 3.0 543.5 ± 2.3 350.6 ± 1.0 186.6 ± 1.2 185.7 ± 0.3 186.3 ± 1.1 197.1 ± 0.6 373.3 ± 1.6
TypeCast 338.8 ± 4.9 228.1 ± 0.9 230.3 ± 0.8 229.5 ± 9.4 153.8 ± 0.4 🏆 138.3 ± 1.0 🏆 241.1 ± 1.1 590.0 ± 2.1 589.9 ± 0.8 155.2 ± 2.4 149.4 ± 1.6 🏆 148.4 ± 1.2 🏆 214.6 ± 2.2 211.7 ± 1.6
long
Jwodder 758.1 ± 1.8 600.6 ± 0.9 601.5 ± 2.2 601.5 ± 2.8 581.2 ± 1.9 600.6 ± 1.8 586.3 ± 3.6 745.9 ± 3.6 685.8 ± 2.2 183.1 ± 1.0 182.5 ± 0.5 182.6 ± 0.6 553.2 ± 1.5 488.0 ± 0.8
VoigtModulo 1,360.8 ± 6.1 1,202.0 ± 2.1 600.0 ± 2.4 600.0 ± 3.0 607.0 ± 6.8 599.0 ± 1.6 1,187.2 ± 2.6 1,439.6 ± 6.7 1,346.5 ± 2.9 197.9 ± 0.7 208.2 ± 0.6 208.0 ± 0.4 548.9 ± 1.4 1,326.4 ± 3.0
VoigtNoModulo 844.5 ± 6.9 647.3 ± 1.3 628.9 ± 1.8 627.9 ± 1.6 629.1 ± 2.4 629.6 ± 4.4 668.2 ± 2.7 1,019.5 ± 3.2 715.1 ± 8.2 224.3 ± 4.8 219.0 ± 1.0 219.0 ± 0.6 561.7 ± 2.5 769.4 ± 9.3
TypeCast 366.1 ± 0.8 🏆 246.2 ± 1.1 🏆 245.3 ± 1.8 🏆 244.6 ± 1.1 🏆 154.6 ± 1.6 🏆 154.3 ± 0.5 🏆 257.4 ± 1.5 🏆 591.8 ± 4.1 🏆 590.4 ± 1.3 🏆 154.5 ± 1.3 🏆 135.4 ± 8.3 🏆 132.9 ± 0.7 🏆 132.8 ± 1.2 🏆 177.4 ± 2.3 🏆

Now I can finally get on with my life :P

Integer division with round-up.

Only 1 division executed per call, no % or * or conversion to/from floating point, works for positive and negative int . See note (1).

n (numerator) = OPs myIntNumber;  
d (denominator) = OPs myOtherInt;

The following approach is simple. int division rounds toward 0. For negative quotients, this is a round up so nothing special is needed. For positive quotients, add d-1 to effect a round up, then perform an unsigned division.

Note (1) The usual divide by 0 blows things up and MININT/-1 fails as expected on 2's compliment machines.

int IntDivRoundUp(int n, int d) {
  // If n and d are the same sign ... 
  if ((n < 0) == (d < 0)) {
    // If n (and d) are negative ...
    if (n < 0) {
      n = -n;
      d = -d;
    }
    // Unsigned division rounds down.  Adding d-1 to n effects a round up.
    return (((unsigned) n) + ((unsigned) d) - 1)/((unsigned) d);  
  }
  else {
    return n/d;
  }
}

[Edit: test code removed, see earlier rev as needed]

Just use

int ceil_of_division = ((dividend-1)/divisor)+1;

For example:

for (int i=0;i<20;i++)
    std::cout << i << "/8 = " << ((i-1)/8)+1 << std::endl;

A small hack is to do:

int divideUp(int a, int b) {
    result = (a-1)/b + 1;
}
// Proof:
a = b*N + k (always)
if k == 0, then
  (a-1)       == b*N  - 1
  (a-1)/b     == N - 1
  (a-1)/b + 1 == N ---> Good !

if k > 0, then
  (a-1)       == b*N   + l
  (a-1)/b     == N
  (a-1)/b + 1 == N+1  ---> Good !

Instead of using the ceil function before casting to int, you can add a constant which is very nearly (but not quite) equal to 1 - this way, nearly anything (except a value which is exactly or incredibly close to an actual integer) will be increased by one before it is truncated.

Example:

#define EPSILON (0.9999)

int myValue = (int)(((float)myIntNumber)/myOtherInt + EPSILON);

EDIT: after seeing your response to the other post, I want to clarify that this will round up, not away from zero - negative numbers will become less negative, and positive numbers will become more positive.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM