简体   繁体   中英

Matlab nchoosek got difference answer using int64 and sym

This is a question about the function nchoosek in Matlab.

I want to find nchoosek(54,25) , which is the same as 54C25. Since the answer is about 10^15, I originally use int64 . However the answer is wrong with respect to the symbolic one.

Input:

nchoosek(int64(54),int64(25))
nchoosek(sym(54),sym(25))

Output:

1683191473897753
1683191473897752

You can see that they differ by one. This is not really an urgent problem since I now use sym . However can someone tell me why this happens?


EDIT:

I am using R2013a.

I take a look at the nchoosek.m , and find that if the input are in int64 , the code can be simplified into

function c = nchoosek2(v,k)

    n = v;  % rename v to be n. the algorithm is more readable this way.

    classOut = 'int64';
    nd = double(n);
    kd = double(k);
    nums = (nd-kd+1):nd;
    dens = 1:kd;
    nums = nums./dens;      %%
    c = round(prod(nums));
    c = cast(c,classOut);
end

However, the outcome of int64(prod(nums./dens)) is different from prod(sym(nums)./sym(dens)) for me. Is this the same for everyone?

I don't have this problem on R2014a:

Numeric

>> n = int64(54);
>> k = int64(25);
>> nchoosek(n,k)
ans =
     1683191473897752    % class(ans) == int64

Symbolic

>> nn = sym(n);
>> kk = sym(k);
>> nchoosek(nn,kk)
ans =
1683191473897752         % class(ans) == sym

% N!/((N-K)! K!)
>> factorial(nn) / (factorial(nn-kk) * factorial(kk))
ans =
1683191473897752         % class(ans) == sym

If you check the source code of the function edit nchoosek.m , you'll see it specifically handles the case of 64-bit integers using a separate algorithm. I won't reproduce the code here, but here are the highlights:

function c = nchoosek(v,k)
    ...

    if int64type
        % For 64-bit integers, use an algorithm that avoids
        % converting to doubles
        c = binCoef(n,k,classOut);
    else
        % Do the computation in doubles.
        ...
    end

    ....
end

function c = binCoef(n,k,classOut)
    % For integers, compute N!/((N-K)! K!) using prime factor cancellations
    ...
end

In 2013a this can be reproduced...

There is as @Amro shows a special case in nchoosek for classOut of int64 or unit64,
however in 2013a this is only applied when the answer is between

  • flintmax (with no argument) and
  • double(intmax(classOut)) + 2*eps(double(intmax(classOut)))

which for int64 gives 9007199254740992 & 9223372036854775808, which the solution does not lie between...


If the solution had fallen between these values it would be recalculated using the subfunction binCoef for which the help states: For integers, compute N!/((NK)! M!) using prime factor cancellations

The binCoef function would have produced the right answer for the given int64 inputs

In 2013a with these inputs binCoef is not called

Instead the "default" pascals triangle method is used in which:

  • Inputs are cast to double
  • The product of the vector ((n-k+1):n)./(1:k) is taken
  • this vector contains k double representations of fractions.

So what we have is almost certainly floating point error .


What can be done?

Two options I can see;

  1. Make your own function based on the code in binCoef,
  2. Modify nchoosek and remove && c >= flintmax from line 81

Removing this expression will force Matlab to use the more accurate integer based calculation for inputs of int64 and uint64 for any values within their precision. This will be slightly slower but will avoid floating point errors, which are rightfully unexpected when working with integer types.

Option one - should be fairly straight forward...

Option two - I recommend keeping an unchanged backup of the original function, or makeing a copy of the function with the modification and use that instead.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM