gcl-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Gcl-devel] Fwd: secret of rat3a/ctimes with gcl ?


From: Camm Maguire
Subject: Re: [Gcl-devel] Fwd: secret of rat3a/ctimes with gcl ?
Date: Wed, 20 Jun 2012 18:08:06 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

Greetings, and thanks!

Here are the definitions I'm examining now:

(progn
  
  (push '((fixnum fixnum) fixnum #.(compiler::flags compiler::rfa) 
          "(fixnum)(((long long)(#0))%((long long)(#1)))") (get 'i% 
'compiler::inline-always))
  (push '((fixnum fixnum fixnum) fixnum #.(compiler::flags compiler::rfa) 
          "(fixnum)(((long long)(#0))*((long long)(#1))%((long long)(#2)))") 
(get '*% 'compiler::inline-always))
  (push '((fixnum fixnum fixnum) fixnum #.(compiler::flags compiler::rfa) 
          "(fixnum)(((long long)(#0))+((long long)(#1))%((long long)(#2)))") 
(get '+% 'compiler::inline-always))
  (push '((fixnum fixnum fixnum) fixnum #.(compiler::flags compiler::rfa) 
          "(fixnum)(((long long)(#0))-((long long)(#1))%((long long)(#2)))") 
(get '-% 'compiler::inline-always))
  (setf (get 'i%  'compiler::return-type) t)
  (setf (get '*% 'compiler::return-type) t)
  (setf (get '+% 'compiler::return-type) t)
  (setf (get '-% 'compiler::return-type) t)
  
  (defmacro mcf (f a &optional b &aux (z (assoc f '((identity 64 i%)(* 32 *%)(+ 
63 +%)(- 63 -%))))(tp (cadr z))(ff (caddr z)))
    `(if ,(if b `(typep modulus ',(if (< (integer-length most-positive-fixnum) 
tp) `fixnum `(signed-byte ,tp)))
            `(and (typep modulus 'fixnum) (typep ,a 'fixnum)))
         (let ((x ,a),@(when b `((y ,b)))(z modulus))
           (declare (fixnum x ,@(when b `(y)) z))
           (let ((k (,ff x ,@(when b `(y)) z))(q (ash z -1)))
             (declare (fixnum k q))
             (if (> k q) (the fixnum (- k z)) (if (< k (- q)) (the fixnum (+ k 
z)) k))))
       (let ((k ,@(if b `((,f ,a ,b)) `(,a))))
         (if modulus (let ((q (ash modulus -1))(k (mod k modulus))) (if (> k q) 
(- k modulus) (if (< k (- q)) (+ k modulus) k))) k))))

  (si::define-compiler-macro ctimes      (a b) `(mcf * ,a ,b))
  (si::define-compiler-macro cplus       (a b) `(mcf + ,a ,b))
  (si::define-compiler-macro cdifference (a b) `(mcf - ,a ,b))
  (si::define-compiler-macro cmod        (a)   `(mcf identity ,a))
)


The unsigned mod version of mcf would be

 (defmacro mcf (f a &optional b &aux (z (assoc f '((identity 64 i%)(* 32 *%)(+ 
63 +%)(- 63 -%))))(tp (cadr z))(ff (caddr z)))
    `(if ,(if b `(typep modulus ',(if (< (integer-length most-positive-fixnum) 
tp) `fixnum `(signed-byte ,tp)))
            `(and (typep modulus 'fixnum) (typep ,a 'fixnum)))
         (let ((x ,a),@(when b `((y ,b)))(z modulus))
           (declare (fixnum x ,@(when b `(y)) z))
           (,ff x ,@(when b `(y)) z))
       (let ((k ,@(if b `((,f ,a ,b)) `(,a))))
         (if modulus (mod k modulus) k))))

These speed up your routines nicely.  They don't appear to noticeably
impact the maxima testsuite runtimes.

I suppose the 'balanced mod' implementation is to control the growth of
the integers?  What is unclear to me is that the #-kcl version in
rat3a.lisp only checks the > mod/2 branch, whereas this as well as the <
-mod/2 appear necessary on my system.

Anyway, let me know if this helps.

Take care,

Volker van Nek <address@hidden> writes:

> Just
> modulus : 7 $
>
> My polynomial '(13 1 12 2 11 3 10 4 0 5) for testing had coefficients smaller 
> than 7.
>
> Greetings
> Volker
>
> 2012/6/14 Camm Maguire <address@hidden>
>
>     Greetings, and thanks!  What are you using for the modulus setting in
>     your timings?
>    
>     Volker van Nek <address@hidden> writes:
>    
>     > Hi Camm,
>     >
>     > I attach a file with my versions psq1 and psq2 and with ptimes1 from 
> rat3a.lisp and some functions from rat3a.lisp and optimize.lisp which
>     are
>     > envolved in running ptimes1 and ctimes.
>     >
>     > Greetings
>     > Volker
>     >
>     > 2012/6/14 Camm Maguire <address@hidden>
>     >
>     >     Greetings!  Could you please post the source to psq1, psq1, and 
> ptimes1?  Thanks!
>     >
>     >     Volker van Nek <address@hidden> writes:
>     >
>     >     > Hi Camm. Greetings!
>     >     >
>     >     > Here at work I have a (damned slow) Computer which runs Maxima on 
> sbcl and I also performed a timing test with psq1, psq2 and rat3a/
>     ptimes1.
>     >     >
>     >     > (%i4) :lisp (time (dotimes (i 100000) (psq1 '(13 1 12 2 11 3 10 4 
> 0 5))))
>     >     >
>     >     > Evaluation took:
>     >     >   0.460 seconds of real time
>     >     >   0.428026 seconds of total run time (0.424026 user, 0.004000 
> system)
>     >     >   [ Run times consist of 0.016 seconds GC time, and 0.413 seconds 
> non-GC time. ]
>     >     >   93.04% CPU
>     >     >   1,103,390,864 processor cycles
>     >     >   35,201,008 bytes consed
>     >     >  
>     >     > NIL
>     >     > (%i4) :lisp (time (dotimes (i 100000) (psq2 '(13 1 12 2 11 3 10 4 
> 0 5))))
>     >     >
>     >     > Evaluation took:
>     >     >   0.301 seconds of real time
>     >     >   0.264016 seconds of total run time (0.264016 user, 0.000000 
> system)
>     >     >   [ Run times consist of 0.012 seconds GC time, and 0.253 seconds 
> non-GC time. ]
>     >     >   87.71% CPU
>     >     >   723,411,304 processor cycles
>     >     >   35,198,904 bytes consed
>     >     >  
>     >     > NIL
>     >     > (%i4) :lisp (time (dotimes (i 100000) (ptimes1 '(13 1 12 2 11 3 
> 10 4 0 5) '(13 1 12 2 11 3 10 4 0 5))))
>     >     >
>     >     > Evaluation took:
>     >     >   1.050 seconds of real time
>     >     >   1.024065 seconds of total run time (1.020064 user, 0.004001 
> system)
>     >     >   [ Run times consist of 0.004 seconds GC time, and 1.021 seconds 
> non-GC time. ]
>     >     >   97.52% CPU
>     >     >   2,520,188,384 processor cycles
>     >     >   21,600,864 bytes consed
>     >     >  
>     >     > NIL
>     >     >
>     >     > Here the number of processor cycles corresponds to the timing 
> results I expected.
>     >     >
>     >     > psq2 uses just (mod (* a b) modulus) where
>     >     >
>     >     > psq1 uses
>     >     >
>     >     > (cond
>     >     >   ((not (null modulus))
>     >     >     (let ((.n. (mod (* a b) modulus)))
>     >     >       (cond
>     >     >         ((<= (* 2 .n.) modulus) .n.)
>     >     >         (t (- .n. modulus)) )))
>     >     >
>     >     > at the same place, which is a lot more!
>     >     >
>     >     > rat3a/ptimes1 even uses rat3a/ptimes instead of rat3a/ctimes 
> which puts a lot of conditionals on top and it uses an algorithm which
>     consists
>     >     of
>     >     > nearly twice the number of multiplications and additions.
>     >     >
>     >     > How can it happen, that doing twice the work is faster in gcl 
> than just doing it once? sbcl shows the timings I expected. What is
>     different
>     >     with
>     >     > gcl?
>     >     >
>     >     > Volker
>     >     >
>     >     > 2012/6/13 Volker van Nek <address@hidden>
>     >     >
>     >     >     Hi Camm. Greetings!
>     >     >
>     >     >     Thank you for taking your time for this.
>     >     >
>     >     >     I don't expect overflowing the fixnum bounds, the numbers are 
> very small. Using e.g. a modulus of 7 the multiplication is of type
>     3*4.
>     >     >
>     >     >     (setq si::*notify-gbc* t)
>     >     >     I do not understand what the answers want to tell me. I hope 
> you know.
>     >     >
>     >     >     I did some experiments with (declare (integer a b c)), but 
> this did not seem to have influence.
>     >     >
>     >     >     I attached the code in two variants and a session introducing 
> the function and some timing results on my computer.
>     >     >
>     >     >     Thanks in advance.
>     >     >
>     >     >     Volker van Nek
>     >     >
>     >     >     2012/6/13 Camm Maguire <address@hidden>
>     >     >
>     >     >         Greetings!
>     >     >
>     >     >         I'd like to actually see your implementation to make 
> sure.  But my guess
>     >     >         is that the unsigned arithmetic is overflowing the fixnum 
> bounds,
>     >     >         i.e. producing bignums.  You might get an idea by (setq 
> si::*notify-gbc*
>     >     >         t).  Is anything known about modulus lisp-type wise?
>     >     >
>     >     >         Volker van Nek <address@hidden> writes:
>     >     >
>     >     >         > Hi Camm,
>     >     >         >
>     >     >         > there was no answer on the Maxima list to my question. 
> Perhaps you might have an idea.
>     >     >         >
>     >     >         > Thanks in advance
>     >     >         > Volker van Nek
>     >     >         >
>     >     >         > ---------- Forwarded message ----------
>     >     >         > From: Volker van Nek <address@hidden>
>     >     >         > Date: 2012/6/9
>     >     >         > Subject: secret of rat3a/ctimes with gcl ?
>     >     >         > To: address@hidden
>     >     >         >
>     >     >         > I have coded a squaring function for polynomials in 
> finite fields. As expected it is roughly 1.8 times faster than
>     multiplying
>     >     poly*poly
>     >     >         with rat3a
>     >     >         > /ptimes, wich is the underlying lisp function for 
> Maxima level fasttimes.
>     >     >         >
>     >     >         > To multiply the coefficients the function ptimes uses 
> ctimes, which multiplies and reduces the result by a balanced modulus.
>     From
>     >     rat3a:
>     >     >         >
>     >     >         >   (defmacro mcmod (n) ;;; gives answers from -modulus/2 
> ...0 1 2 +modulus/2
>     >     >         >     `(let ((.n. (mod ,n modulus)))
>     >     >         >       (cond ((<= (* 2 .n.) modulus) .n.)
>     >     >         >         (t (- .n. modulus)))))
>     >     >         >
>     >     >         >   (defun ctimes (a b)
>     >     >         >     (cond ((null modulus) (* a b))
>     >     >         >       (t (mcmod (* a b)))))
>     >     >         >
>     >     >         > In my squaring function I want to replace each (ctimes 
> a b) by simply (mod (* a b) modulus). I need positive results.
>     >     >         >
>     >     >         > As expected with clisp and sbcl this replacement is 
> faster than ctimes, but with gcl this replacement (which does less!)
>     slows down
>     >     the
>     >     >         whole
>     >     >         > function by a factor 3.
>     >     >         >
>     >     >         > Is there a hidden secret about ctimes with gcl? Any 
> hints are welcome.
>     >     >         >
>     >     >         > Volker van Nek
>     >     >         >
>     >     >
>     >     >         --
>     >     >         Camm Maguire                                       
> address@hidden
>     >     >         
> ==========================================================================
>     >     >         "The earth is but one country, and mankind its citizens." 
>  --  Baha'u'llah
>     >     >
>     >
>     >     --
>     >     Camm Maguire                                       address@hidden
>     >     
> ==========================================================================
>     >     "The earth is but one country, and mankind its citizens."  --  
> Baha'u'llah
>     >
>     >
>    
>     --
>     Camm Maguire                                       address@hidden
>     ==========================================================================
>     "The earth is but one country, and mankind its citizens."  --  Baha'u'llah
>

-- 
Camm Maguire                                        address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah



reply via email to

[Prev in Thread] Current Thread [Next in Thread]