[Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance

octave-bug-tracker

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance

From:	Hendrik K
Subject:	[Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance
Date:	Tue, 4 Apr 2023 01:30:23 -0400 (EDT)

Follow-up Comment #2, bug #63992 (project octave):

Quite some investigation was required:

i) unit16
Nice catch from Arun and I changed the function from int64_t to uint64_t.

Note that this was not the main culprit giving zero for nchoosek (intmax
("uint64")-1, 1).
When doing a cast to (uint64) or an std:trunc when calling the Octave c++
method .double_value () with unit64 max, the result is ALWAYS Zero. 

I did not investigate further why the octave method .double_value () on an
intmax ("uint64")-1 produces such results. One needs a C++-Compiler/Octave C++
Compiler expert to solve this. I entered a separate bug report
https://savannah.gnu.org/bugs/?63995.

But as I wanted to cover also more extreme cases like nchoosek (x, 1) with x >
intmax ("uint64"), I used the same approach as nchoosek.m (which uses fix on a
double) version: I remain in the double “world” for too large numbers and
created an overload function n_over_k accepting a double. Now it also works
for uint64 max and beyond values.

Note: But the program does (yet) not work correctly for intmax ("uint64")-
small X.  Using  .double_value the program receives 18446744073709551616 as
the double value and returns this value and not the correct
18446744073709551615. So bug 63995 needs to be solved first to cover this....

In any case we are deep in numerical issues as a double cannot represent all
unit64_t:  A double, assuming it uses IEE754 double-precision representation,
can only hold 53 bits of precision. A uint64_t uses all 64 bits as value bits,
meaning that there are some values that can be stored exactly in a uint64_t
that cannot be stored exactly in a double.
This leads to the next point.


ii) Once we go beyond “flintmax”, all kind of “odd” result can be
expected. Any algorithm may be 100% correct, but will produce incorrect real
work/analytical results whilst being 100% numerically correct.
For example:

double(intmax ("uint64")-1) == double(intmax ("uint64")-1000) is TRUE.

So any deviation of 1000 or above on large integer number beyond
“flintmax” due to rounding errors does not shock me. It is not without
reasons nchoosek gives a warning “possible loss of precision”.

Still, for values < “ flintmax” the results using double should aim to be
correct.

Luckily this is not too difficult: Change slightly the order of processing in
n_over_k and we can show that we have always correct for values <
“flintmax” by at most one bit (+-1). The +-1 is I guess due to the various
settings and rules following IEEE floating-point arithmetic (IEC 60559)
supported by the Compiler (GNU in my case) and the processor used.  

 for n = 1:67
  for k = 0:(n/2)
    num = den = [];
    for j = 1:n
      num = [num, factor(j)];
    end
    for j = [1:k, 1:(n-k)]
      den = [den, factor(j)];
    end;
    while (~isempty(den))
      ii = find (den(1) == num, 1);
      num(ii) = den(1) = [];
    end
    nck = prod (uint64(num), "native");
    fn = nchoosek (uint64(n), uint64(k));
    if (fn ~= nck)
      delta = max (fn-nck, nck-fn);
      fprintf (1, "n=%2d k=%2d flintmax breached %d function=%u multiply=%u
delta=%d\n", n, k, nck > flintmax(), fn, nck, delta);
    end
  end
end

We can further improve this by removing the +-1 cases for values < “
flintmax” and the average typical deviations from the correct result
(compared to 100% correct integer arithmetic) > “flintmax” when working
internally with a long double which I did before giving back a double.  

Which means that for practical purposes when the result of nchoosek is further
processed using arithmetic operations by an octave program, it is numerically
correct.

Going beyond this will be very difficult I guess: We need to give back a
double and the returned double may not the same (in the double “world”) as
the double of a 100% correct integer arithmetic result beyond “flintmax”.
One would need to contact a  C++-Compiler/Octave Internal Expert to check if
anything further could be done with reasonable effort if at all. 

Thoughts  ?


iii) Usage of typeid
I have some concerns:
- The computer and compiler nowadays are lightening fast on simple C++
instructions like arithmetic and string comparisons. Typical bottlenecks are
rather complex algorithms and especially RAM access. So any change to typeid
is not expected to bring material benefits.
I did a quick check using std::chrono::nanoseconds: Going through all existing
string comparisons including retrieving the class name (so worst case in if
then else checks) takes less than 500 nanoseconds on my machine....

- Besides I did not find a typeid static constant distinguishing “single”
from “double”, only “is_float” which caters for both. E.g. the code in
octave bitfcns.cc also relies on the classname to distinguish “single”
from “doubles”.  So using the string class name cannot be avoided or using
other methods like is_single() and is_double() for which the performance may
be less good than a static int or string comparison. Dito for a single char
for which one  is_char() may be used etc. but I did not dig deeper. 

- The code readability will not be improved by using typeid,  as one needs to
distinguish matrix vs. scalar and as typeid may not work for all classes.

if  (args(0).type_id () == octave_uint32_matrix::static_type_id ()
                   || args(0).type_id () ==
octave_uint32_scalar::static_type_id ())
 
vs 

if (cln == "int32")


Thoughts  ?

 
I attach a new patch catering for i) and ii).

Note that for nchoosek (intmax ("uint64")-1, 1) working 100% correctly bug
63995 needs to be solved first. But the result is beyond flintmax.



@Arun-
Thanks for the points raised.

(file #54554)

    _______________________________________________________

Additional Item Attachment:

File name: nchoosek_v1                    Size:24 KB
    <https://file.savannah.gnu.org/file/nchoosek_v1?file_id=54554>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?63992>

_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/

[Prev in Thread]

Current Thread

[Next in Thread]

[Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, Hendrik K, 2023/04/03
- [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, Arun Giridhar, 2023/04/03
  - [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, Hendrik K <=
    - [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, John W. Eaton, 2023/04/04
    - [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, anonymous, 2023/04/04
    - Message not available
    - [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, Arun Giridhar, 2023/04/04
    - [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, Hendrik K, 2023/04/04
    - [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, Arun Giridhar, 2023/04/04
    - [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance, Hendrik K, 2023/04/05

Prev by Date: [Octave-bug-tracker] [bug #40117] Corruption of symbol table when editing functions in private/ dirs
Next by Date: [Octave-bug-tracker] [bug #63995] Octave c++ method .double_value on (intmax ("uint64")-1)
Previous by thread: [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance
Next by thread: [Octave-bug-tracker] [bug #63992] nchoosek has suboptimal performance
Index(es):
- Date
- Thread