guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: May nearly have simple statistical profiler working (need help).


From: Rob Browning
Subject: Re: May nearly have simple statistical profiler working (need help).
Date: 16 Jul 2001 15:56:20 -0500
User-agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.7

Neil Jerram <address@hidden> writes:

> OK, but the firing of apply traps requires both `(trap-enable
> 'apply-frame)' and `(debug-enable 'traps)' [SCM_TRAPS_P in C].

Hmm.  I was doing a (debug-enable 'trace) based on some poking around
in the code, but it sounds like (debug-enable 'traps) is more
appropriate.

I also kinda wonder what, if anything, we should do about global
states like traps, etc.  My profiler needs to turn them on and off,
but that means that it'll fight with any other system that needs to do
the same thing.  And unless there's a way I can check their state when
I start, I can't even return the system to the state it was in before
the user fired up the profiler.  I was surprised that (traps) doesn't
return an alist.

> So, once you've done a `(statprof-start)', apply traps will fire for
> any application inside the dynamic scope of the `with-traps' that I
> mentioned before.  And this scope is slightly wider than that of the
> `(start-stack 'repl-stack (primitive-eval sourc))'.  In ASCII art...
> 
>  Guile's stack base   (start-stack 'repl-stack ...)      code position
> /                                 |                             |
> |-------- stack-id #f (A)---------|-- stack-id 'repl-stack (B)--|
> |----- traps disabled ------|------- traps enabled -------------|
> |                           |     :                             |
> |                    (with-traps ...)                           |
> |                           |     :                             |
> |------------(1)------------|-(2)-|------------(3)--------------|
> 
> The profile signal handler can fire with the code position in any of
> regions 1, 2 and 3; the trap handler can fire in regions 2 and 3.
> 
> `(make-stack #t ...)' will always give you either stack A or stack
> B, but _never_ the two joined together.  If you get a stack B --
> i.e. region 3 -- you're OK, since everything in stack B is also
> subject to apply traps.  But if you get a stack A -- regions 1 and 2
> -- some of the stack items that you count come from region 1, where
> traps are not enabled.

Hmm.  I've looked at this for a bit, and I'm still confused.  I'll
give it another go later and see if I can come up with questions
that'll help straighten me out.

I did run in to some unexpected behavior.  When the profile-handler
fired when we were already in the apply-frame handler I had trouble
with disable/enable traps.  Originally, I would just unconditionally
disable the apply-frame trap when we entered the profile-handler, and
then I'd check to see if we were already in the apply-frame handler.
If so, then I wouldn't re-enable the apply-frame trap at the end of
the profile-handler, expecting that when resumed, the caller of the
apply-frame handler would do it (as it normally does).  But what I
found was that this doesn't happen if the profile handler disables
apply-frame when it fires inside the apply-frame handler, the
apply-frame handler stays disabled.  

To fix that (and some other problems), I now just have the
profile-handler set a global to indicate that it's been entered, and
the apply-frame handler checks this to see if it should do anything.
This is a little less efficient, but seems to work well.

Also, I no longer use the mask/unmask critical sections.  I now just
have the profile-handler just optimistically scan the stack, sampling
everything, but if it hits the apply-frame handler, it backtracks,
uncounting everything from there back to the stack base, and then
continues upward normally.

> I'm afraid I'm missing something.  In what way is the trap handler
> part of your profiler non-exact?

I only use the trap handler to count total number of calls, not to
gather any timing information (in fact, I try to exclude the time
spent inside the trap handler from the global accumulated time
timer).  I collect:

  Total time spent profiling (minus time in sampler and apply-frame
  handler).

  Total number of samples taken globally.

  Per-function sample-count.

  Per-function exact (apply-frame) call count.

Given this I can (in theory) determine statistical time spent
per-function per-call, percent time in function per-function, etc.

What I don't do is check the start/end time for every call and
subtract.  This would AFAICT screw up tail recursion and be much more
expensive overall.  Plus, enter/exit frame didn't behave the way I
expected and so it didn't look like this would work easily anyway.

Thanks again.

Here's the current code if you're interested -- it's still not
perfect, but it's getting better.  It doesn't yet take into account
your warnings wrt the stacks above, but I'll try to accomodate those
once I really understand them.


;; Questions:

;; TODO
;;
;; Calculate leaf + parents.
;; Fix reset to take interval info.
;; Make sure start-stop interface is clean.
;; make sure frames we're skipping in PROF handler are OK.

(define-module (ice-9 statprof))

;; This profiler tracks two numbers for every function called while
;; it's active.  It tracks the total number of calls, and the number
;; of times the function was active when the sampler fired.
;;
;; Globally the profiler tracks the total time elapsed and the number
;; of times the sampler was fired.
;;
;; Right now, this profiler is not per-thread and is not thread safe.

(define accumulated-time #f)            ; total so far.
(define last-start-time #f)             ; start-time when timer is active.
(define sample-count #f)                ; total count of sampler calls.
(define sampling-frequency #f)          ; in (seconds . microseconds)
(define remaining-prof-time #f)         ; time remaining when prof suspended.
(define profile-level 0)                ; for user start/stop nesting.

;; function-data will be a weak-key-hash where the key is the function
;; object itself and the value is the data.  The data will be a vector
;; like this: #(name call-count sample-count)
(define function-data #f)

;; If you change the call-data data structure, you need to also change
;; sample-uncount-frame.
(define (make-call-data name call-count sample-count)
  (vector name call-count sample-count))
(define (call-data-name cd) (vector-ref cd 0))
(define (call-data-call-count cd) (vector-ref cd 1))
(define (call-data-sample-count cd) (vector-ref cd 2))
(define (set-call-data-name! cd name) (vector-set! cd 0 name))
(define (set-call-data-call-count! cd val) (vector-set! cd 1 val))
(define (set-call-data-sample-count! cd val) (vector-set! cd 2 val))

(export call-data-name
        call-data-count
        call-data-sample-count)

(define (accumulate-time stop-time)
  (set! accumulated-time
        (+ accumulated-time
           (- (tms:utime stop-time) (tms:utime last-start-time))
           (- (tms:stime stop-time) (tms:stime last-start-time)))))

(define (display-frame frame)
  (simple-format #t "Frame ~A\n" frame)
  (simple-format
   #t
   (string-append "  number: ~A\n"
                  "  source: ~A\n"
                  "  procedure: ~A\n"
                  "  arguments: ~A\n"
                  "  previous: ~A\n"
                  "  next: ~A\n"
                  "  real?: ~A\n"
                  "  procedure? ~A\n"
                  "  frame-evaluating-args?: ~A\n"
                  "  frame-overflow?: ~A\n")
   (frame-number frame)
   (frame-source frame)
   (frame-procedure frame)
   (frame-arguments frame)
   (frame-previous frame)
   (frame-next frame)
   (frame-real? frame)
   (frame-procedure? frame)
   (frame-evaluating-args? frame)
   (frame-overflow? frame)))

;;======================================================================
;; SIGPROF handler

(define (sample-count-frame frame)
  (if (frame-procedure? frame)
      (let* ((frame-proc (frame-procedure frame))
             (proc-data (hashq-ref function-data frame-proc)))
        
        
        (if proc-data
            (set-call-data-sample-count! proc-data
                                         (+ (call-data-sample-count proc-data)
                                            1))
            ;; I think perhaps this is impossible...
            (hashq-set! function-data frame-proc
                        (make-call-data (procedure-name frame-proc)
                                        0
                                        1))))))

(define (sample-uncount-frame frame)
  (if (frame-procedure? frame)
      (let* ((frame-proc (frame-procedure frame))
             (proc-data (hashq-ref function-data frame-proc)))

        (if (not proc-data)
            (let ((str (simple-format
                        #f
                        "Can't possibly happen -- ~A must have been counted!\n"
                        (procedure-name frame-proc))))
              (error str)))
        
        (simple-format #t "Uncounting ~A\n" (procedure-name frame-proc))

        (let ((sample-count (call-data-sample-count proc-data)))
          (if (and (= sample-count 1)
                   (zero? (call-data-call-count proc-data)))
              (hashq-remove! function-data frame-proc)
              (set-call-data-sample-count! proc-data (- sample-count 1)))))))

(define (display-stack-summary stack)
  (let ((stacklen (stack-length stack)))
    ;; Now accumulate stats for the whole stack.
    (display "STACK:\n")
    (let loop ((n 0))
      (if (< n stacklen)
          (let ((frame (stack-ref stack n)))
            (simple-format #t
                           "  Frame ~A: ~A\n"
                           (frame-number frame)
                           (frame-procedure frame))
            (loop (+ n 1)))))))

(define (sample-stack-procs stack)
  (let ((stacklen (stack-length stack))
        ;; where to start profiling. (we need to skip the
        ;; profile funcs themselves).
        (caller-stack-num 2)
        (hit-count-call? #f))
    (if (< caller-stack-num stacklen)
        (begin
          ;; We've got at least one non-profiling frame
          (set! sample-count (+ sample-count 1))
          
          ;; Now accumulate stats for the whole stack.
          (let loop ((n caller-stack-num))
            (if (< n stacklen)
                (let ((frame (stack-ref stack n)))

                  (if (and (frame-procedure? frame)
                           (eq? (frame-procedure frame) count-call))
                      ;; Trouble -- we have to uncount all the frames
                      ;; more inner than this one and ignore this one
                      ;; because we're not supposed to be sampling
                      ;; count-call and its sub-functions.
                      (begin
                        (set! hit-count-call? #t)
                        (display "UNCOUNTING =====\n")
                        (do ((i (- n 1) (- i 1)))
                            ((<= i caller-stack-num))
                          (sample-uncount-frame (stack-ref stack i))))
                      (sample-count-frame frame))
                  (loop (+ n 1)))))))
    hit-count-call?))

(define (stack-proc-pos stack proc)
  (let ((stacklen (stack-length stack)))
    (let loop ((n 0))
      (if (< n stacklen)
          (let ((frame (stack-ref stack n)))
            (or (and (frame-procedure? frame)
                     (eq? (frame-procedure frame) proc)
                     n)
                (loop (+ n 1))))
          #f))))

(define inside-profiler? #f)

(define (profile-signal-handler sig)
  (set! inside-profiler? #t)
  ;;(trap-disable 'apply-frame)           ; unconditionally.

  (if (positive? profile-level)
      (let* ((stop-time (times))
             (inside-apply-trap? (sample-stack-procs (make-stack #t))))

        (if (not inside-apply-trap?)
            (begin
              ;; disabling here is just a little more efficient, but
              ;; not necessary given inside-profiler?.  We can't just
              ;; disable unconditionally at the top of this function
              ;; and eliminate inside-profiler? because it seems to
              ;; confuse guile wrt re-enabling the trap when
              ;; count-call finishes.
              (trap-disable 'apply-frame)
              (accumulate-time stop-time)))
        
        (setitimer ITIMER_PROF
                   0 0
                   (car sampling-frequency)
                   (cdr sampling-frequency))
        
        (if (not inside-apply-trap?)
            (begin
              (set! last-start-time (times))
              (trap-enable 'apply-frame)
              ))))
  (set! inside-profiler? #f))

;;======================================================================
;; Count total calls.

(define (count-call trap-name continuation tail)
  (if (not inside-profiler?)
      (begin
        (accumulate-time (times))
        
        (let ((frame (last-stack-frame continuation)))
          (if (frame-procedure? frame)
              (let* ((frame-proc (frame-procedure frame))
                     (proc-data (hashq-ref function-data frame-proc)))
                (if proc-data
                    (set-call-data-call-count! proc-data
                                               (+ (call-data-call-count 
proc-data)
                                                  1))
                    (hashq-set! function-data frame-proc
                                (make-call-data (procedure-name frame-proc)
                                                1
                                                0))))))
        
        (set! last-start-time (times)))))


;;======================================================================

(define (statprof-active?) (positive? profile-level))

;; Do not call this from statprof internal functions -- user only.
(define (statprof-start)
  ;; After some head-scratching, I don't *think* I need to mask/unmask
  ;; signals here, but if I'm wrong, please let me know.
  (set! profile-level (+ profile-level 1))
  (if (= profile-level 1)
      (let* ((rpt remaining-prof-time)
             (use-rpt? (and rpt
                            (or (positive? (car rpt))
                                (positive? (cdr rpt))))))
        (set! remaining-prof-time #f)
        (set! last-start-time (times))
        (if use-rpt?
            (setitimer ITIMER_PROF 0 0 (car rpt) (cdr rpt))
            (setitimer ITIMER_PROF
                       0 0
                       (car sampling-frequency)
                       (cdr sampling-frequency)))
        (trap-enable 'apply-frame)
        #t)))
  
;; Do not call this from statprof internal functions -- user only.
(define (statprof-stop)
  ;; After some head-scratching, I don't *think* I need to mask/unmask
  ;; signals here, but if I'm wrong, please let me know.
  (set! profile-level (- profile-level 1))
  (if (zero? profile-level)
      (begin
        (trap-disable 'apply-frame)
        ;; I believe that we need to do this before getting the time
        ;; (unless we want to make things even more complicated).
        (set! remaining-prof-time (setitimer ITIMER_PROF 0 0 0 0))
        (accumulate-time (times))
        (set! last-start-time #f))))

(define (statprof-reset sample-seconds sample-microseconds)
  (if (not (zero? profile-level))
      (error "Can't reset profiler while profiler is running."))
  (set! accumulated-time 0)
  (set! last-start-time #f)
  (set! sample-count 0)
  (set! sampling-frequency (cons sample-seconds sample-microseconds))
  (set! remaining-prof-time #f)
  (set! function-data (make-weak-key-hash-table 131))
  (trap-set! apply-frame-handler count-call)
  (debug-enable 'traps)
  (sigaction SIGPROF profile-signal-handler)
  #t)

(define (statprof-map-called func)
  ;; func should take one arg, call-data.  Note that a
  ;; given proc-name may appear multiple times, but if it does, it
  ;; represents different functions with the same name.
  (hash-fold
   (lambda (key value prior-result)
     (func value))
   #f
   function-data))

(define (statprof-display)
  (statprof-map-called
   (lambda (data)
     (simple-format #t
                    "==[~A ~A ~A]\n"
                    (call-data-name data)
                    (call-data-call-count data)
                    (call-data-sample-count data))))
  (simple-format #t "Total time: ~A\n" accumulated-time)
  (simple-format #t "Sample count: ~A\n" sample-count))

(define (statprof-display-anomolies)
  (statprof-map-called
   (lambda (data)
     (if (and (zero? (call-data-call-count data))
              (positive? (call-data-sample-count data)))
         (simple-format #t
                        "==[~A ~A ~A]\n"
                        (call-data-name data)
                        (call-data-call-count data)
                        (call-data-sample-count data)))))
  (simple-format #t "Total time: ~A\n" accumulated-time)
  (simple-format #t "Sample count: ~A\n" sample-count))

(export statprof-display-anomolies)

(define (statprof-accumulated-time)
  (if (positive? profile-level)
      (error "Can't get accumulated time while profiler is running."))
  accumulated-time)

(define (test-func)
  (simple-format #t "Starting test.\n")
  (simple-format #t "Starting test.\n")
  (simple-format #t "Starting test.\n")
  (let loop ((x 1000))
    (if (positive? x)
        (loop (- x 1))))
  (display (traps)) (newline)
  (display (hashq-ref function-data simple-format)) (newline)
  (traps 'help)
  (simple-format #t "Stopping test.\n")
  (simple-format #t "Stopping test.\n")
  #t)

(define (statprof-test)
  (statprof-reset 0 25000)
  (statprof-start)
  (test-func)
  (statprof-stop)
  (statprof-display)
  )

(export statprof-test)

(export statprof-active?
        statprof-start
        statprof-stop
        statprof-reset
        statprof-map-called
        statprof-accumulated-time
        statprof-display)


-- 
Rob Browning
rlb @defaultvalue.org, @linuxdevel.com, and @debian.org
Previously @cs.utexas.edu
GPG=1C58 8B2C FB5E 3F64 EA5C  64AE 78FE E5FE F0CB A0AD



reply via email to

[Prev in Thread] Current Thread [Next in Thread]