guile-devel

Re: CPU and GC cost of bignums


From: Ludovic Courtès
Subject: Re: CPU and GC cost of bignums
Date: Thu, 06 Feb 2020 14:37:52 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)

Hi!

Andy Wingo <address@hidden> skribis:

> Nice investigation!  Perhaps slot-allocation should track live variables
> using something that's not bigints, but who knows.

Yeah, I wondered; it’s not clear whether bitvectors would be more
efficient, for instance, although we could perhaps make them locally
imperative.
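
To make "locally imperative" a bit more concrete, here is a minimal sketch
in C of a live-variable set that is updated in place, instead of allocating
a fresh integer for every logior/ash.  The names (live_set, live_set_add,
and so on) and the fixed capacity are made up for illustration; this is not
Guile's slot-allocation code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity: 4 words of 64 bits, i.e. up to 256 variables.  */
#define LIVE_SET_WORDS 4

typedef struct
{
  uint64_t words[LIVE_SET_WORDS];
} live_set;

/* Empty the set in place.  */
static void
live_set_clear (live_set *s)
{
  memset (s->words, 0, sizeof s->words);
}

/* Mark variable VAR as live; VAR must be below 64 * LIVE_SET_WORDS.  */
static void
live_set_add (live_set *s, unsigned var)
{
  s->words[var / 64] |= (uint64_t) 1 << (var % 64);
}

/* Mark variable VAR as dead.  */
static void
live_set_remove (live_set *s, unsigned var)
{
  s->words[var / 64] &= ~((uint64_t) 1 << (var % 64));
}

/* Return non-zero if VAR is live.  */
static int
live_set_member (const live_set *s, unsigned var)
{
  return (s->words[var / 64] >> (var % 64)) & 1;
}

/* DST := DST union SRC, without allocating anything.  */
static void
live_set_union (live_set *dst, const live_set *src)
{
  for (int i = 0; i < LIVE_SET_WORDS; i++)
    dst->words[i] |= src->words[i];
}
```

Whether this would actually beat bignums in the compiler is exactly the
open question; the sketch only shows that the set operations need no
allocation once the storage exists.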

> On Wed 05 Feb 2020 17:29, Ludovic Courtès <address@hidden> writes:
>
>>  /* The next three functions (custom_libgmp_*) are passed to
>>     mp_set_memory_functions (in GMP) so that memory used by the digits
>>     themselves is known to the garbage collector.  This is needed so
>> @@ -237,19 +227,20 @@ finalize_bignum (void *ptr, void *data)
>>  static void *
>>  custom_gmp_malloc (size_t alloc_size)
>>  {
>> -  return scm_malloc (alloc_size);
>> +  return scm_gc_malloc (alloc_size, "GMP");
>>  }
>>  
>>  static void *
>>  custom_gmp_realloc (void *old_ptr, size_t old_size, size_t new_size)
>>  {
>> -  return scm_realloc (old_ptr, new_size);
>> +  return scm_gc_realloc (old_ptr, old_size, new_size, "GMP");
>>  }
>>  
>>  static void
>>  custom_gmp_free (void *ptr, size_t size)
>>  {
>> -  free (ptr);
>> +  /* Do nothing: all memory allocated by GMP is under GC control and
>> +     will be freed when needed.  */
>>  }
>
> I think this makes sense to me as a short-term fix.  The down-side is
> that limbs can alias Scheme objects.

Yes.
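
To spell out the aliasing concern: memory obtained from scm_gc_malloc is
scanned conservatively, so any limb whose bit pattern happens to fall
within the heap's address range looks like a pointer and may keep an
unrelated object alive.  Below is a toy model of that check, not libgc's
actual marking code; the function name and the heap bounds are invented
for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy model of a conservative scan: count the words in BUF whose value
   lies in [HEAP_LO, HEAP_HI) and would therefore be treated as potential
   pointers.  HEAP_LO and HEAP_HI are hypothetical heap bounds.  */
static size_t
count_possible_pointers (const uintptr_t *buf, size_t n_words,
                         uintptr_t heap_lo, uintptr_t heap_hi)
{
  size_t hits = 0;

  for (size_t i = 0; i < n_words; i++)
    if (buf[i] >= heap_lo && buf[i] < heap_hi)
      hits++;

  return hits;
}
```

With limbs { 0x1000, 0x2008, 0x5000, 42 } and a pretend heap spanning
[0x1000, 0x3000), two of the four limbs would be mistaken for pointers.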

To my surprise, on a pure bignum microbenchmark, the patch is
counterproductive:

--8<---------------cut here---------------start------------->8---
$ guile ~/src/guile-debugging/bignum-finalizers.scm  # 3.0.0
clock utime stime cutime cstime gctime
 2.42  6.20  0.17   0.00   0.00   5.62
heap size: 2.0 MiB
$ /data/src/guile-3.0/meta/guile  ~/src/guile-debugging/bignum-finalizers.scm
clock utime stime cutime cstime gctime
 3.97 10.91  0.15   0.00   0.00  10.60
heap size: 3.0 MiB
$ cat ~/src/guile-debugging/bignum-finalizers.scm
(use-modules (ice-9 time))

(time
 (let loop ((n (expt 2 18))
            (i 1))
   (unless (zero? n)
     ;; (display ".")
     (loop (- n 1)
           (logior 0 (ash i 1))))))

(format #t "heap size: ~a MiB~%"
        (round
         (/ (assoc-ref (gc-stats) 'heap-size) (expt 2. 20))))
--8<---------------cut here---------------end--------------->8---

(Here we’re creating ~2^18 bignums, one per loop iteration, no more.)
I wonder if there’s another part of the story that I’m missing here.

Perf report for 3.0.0:

--8<---------------cut here---------------start------------->8---
  46.93%  guile    libgc.so.1.3.6         [.] GC_mark_from
  17.61%  guile    libgc.so.1.3.6         [.] GC_header_cache_miss
   9.96%  guile    libgc.so.1.3.6         [.] GC_add_to_black_list_normal
   5.20%  guile    libgmp.so.10.3.2       [.] __gmpn_lshift_coreisbr
   4.13%  guile    libgc.so.1.3.6         [.] GC_find_header
   2.28%  guile    libgc.so.1.3.6         [.] GC_finalize
   2.09%  guile    libgc.so.1.3.6         [.] GC_base
--8<---------------cut here---------------end--------------->8---

With the patch:

--8<---------------cut here---------------start------------->8---
  48.40%  guile            libgc.so.1.3.6         [.] GC_mark_from
  17.74%  guile            libgc.so.1.3.6         [.] GC_header_cache_miss
  11.90%  guile            libgc.so.1.3.6         [.] GC_add_to_black_list_normal
   4.45%  guile            libgc.so.1.3.6         [.] GC_find_header
   2.31%  guile            libgmp.so.10.3.2       [.] __gmpn_lshift_coreisbr
   2.30%  guile            libgc.so.1.3.6         [.] GC_base
   1.73%  guile            libgc.so.1.3.6         [.] GC_finalize
--8<---------------cut here---------------end--------------->8---

IOW, the relative share of actual computation (__gmpn_lshift_coreisbr)
drops from about 5% to 2%.

Thoughts?

> In the long-term I think we should be representing bignums as
> pointerless objects whose first word is the tag and a word count,
> followed by inline "limbs" (in the sense of
> https://gmplib.org/manual/Nomenclature-and-Types.html#Nomenclature-and-Types).
> Generally we can use the low-level API to work on these
> (https://gmplib.org/manual/Low_002dlevel-Functions.html#Low_002dlevel-Functions),
> and if we need to use mpz_t, we can easily create an mpz_t that points
> to these values.

Yes, that sounds like the right approach longer-term.  Note that ‘mpz_t’
is exposed through “numbers.h”, which I guess means we cannot change
that in 3.0.x.
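
For the record, here is a rough sketch in C of what such a layout could
look like: one header word holding a tag and the limb count, followed by
the limbs inline, plus a one-bit left shift written against the limbs
directly (a plain-C analogue of what mpn_lshift does for a shift count of
1).  The tag value, the header encoding, and all names are assumptions
made for illustration, and calloc stands in for a pointerless GC
allocation so that the example is self-contained.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>

typedef uint64_t limb_t;        /* stand-in for GMP's mp_limb_t */

#define BIGNUM_TAG 0x17u        /* made-up tag value */

struct inline_bignum
{
  uintptr_t header;             /* hypothetical: (n_limbs << 8) | tag */
  limb_t limbs[];               /* least significant limb first */
};

/* Allocate a zeroed bignum with N_LIMBS inline limbs.  A real
   implementation would use a pointerless GC allocation instead of
   calloc.  Returns NULL on allocation failure.  */
static struct inline_bignum *
make_inline_bignum (size_t n_limbs)
{
  struct inline_bignum *b =
    calloc (1, sizeof *b + n_limbs * sizeof (limb_t));

  if (b != NULL)
    b->header = ((uintptr_t) n_limbs << 8) | BIGNUM_TAG;
  return b;
}

/* Recover the limb count from the header word.  */
static size_t
bignum_n_limbs (const struct inline_bignum *b)
{
  return b->header >> 8;
}

/* Shift SRC left by one bit into DST, which must have the same number
   of limbs.  Return the bit shifted out of the most significant limb.  */
static limb_t
bignum_lshift1 (struct inline_bignum *dst, const struct inline_bignum *src)
{
  limb_t carry = 0;
  size_t n = bignum_n_limbs (src);

  for (size_t i = 0; i < n; i++)
    {
      limb_t l = src->limbs[i];
      dst->limbs[i] = (l << 1) | carry;
      carry = l >> 63;
    }
  return carry;
}
```

Since the object contains no pointers, the collector would have no reason
to scan it.  Where an mpz_t is needed, GMP's mpz_roinit_n can build a
read-only one on top of an existing limb array, which is presumably the
kind of thing meant above.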

Thanks,
Ludo’.
