[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
01/06: Instrument resource pool checkout failures
From: |
Christopher Baines |
Subject: |
01/06: Instrument resource pool checkout failures |
Date: |
Thu, 7 Mar 2024 05:08:37 -0500 (EST) |
cbaines pushed a commit to branch master
in repository data-service.
commit 76712e2b007d502eba51bfcc122f6f4acfe7ae10
Author: Christopher Baines <mail@cbaines.net>
AuthorDate: Sat Feb 10 10:11:29 2024 +0000
Instrument resource pool checkout failures
As I've got no idea why the resource pools sometimes stop working.
---
guix-data-service/utils.scm | 22 ++++++++++++++++------
guix-data-service/web/controller.scm | 7 ++++++-
2 files changed, 22 insertions(+), 7 deletions(-)
diff --git a/guix-data-service/utils.scm b/guix-data-service/utils.scm
index 5d7d4ba..2f7ac60 100644
--- a/guix-data-service/utils.scm
+++ b/guix-data-service/utils.scm
@@ -123,7 +123,8 @@
(sleep 5)
(destructor/safe args)))))
- (let ((channel (make-channel)))
+ (let ((channel (make-channel))
+ (checkout-failure-count 0))
(spawn-fiber
(lambda ()
(while #t
@@ -164,8 +165,12 @@
(wrap-operation
(put-operation reply new-resource)
(const #t))
- (wrap-operation (sleep-operation 0.2)
+ (wrap-operation (sleep-operation 1)
(const #f))))))
+ (unless checkout-success?
+ (set! checkout-failure-count
+ (+ 1 checkout-failure-count)))
+
(loop (cons new-resource resources)
(if checkout-success?
available
@@ -183,8 +188,12 @@
(wrap-operation
(put-operation reply (car available))
(const #t))
- (wrap-operation (sleep-operation 0.2)
+ (wrap-operation (sleep-operation 1)
(const #f))))))
+ (unless checkout-success?
+ (set! checkout-failure-count
+ (+ 1 checkout-failure-count)))
+
(if checkout-success?
(loop resources
(cdr available)
@@ -222,9 +231,10 @@
resources-last-used)))
(('stats reply)
(let ((stats
- `((resources . ,(length resources))
- (available . ,(length available))
- (waiters . ,(length waiters)))))
+ `((resources . ,(length resources))
+ (available . ,(length available))
+ (waiters . ,(length waiters))
+ (checkout-failure-count .
,checkout-failure-count))))
(perform-operation
(choice-operation
diff --git a/guix-data-service/web/controller.scm
b/guix-data-service/web/controller.scm
index de7ba7c..1ac081b 100644
--- a/guix-data-service/web/controller.scm
+++ b/guix-data-service/web/controller.scm
@@ -190,7 +190,12 @@
(waiters . ,(make-gauge-metric
registry
"resource_pool_waiters_total"
- #:labels '(pool_name)))))
+ #:labels '(pool_name)))
+ (checkout-failure-count
+ . ,(make-gauge-metric
+ registry
+ "resource_pool_checkout_failures_total"
+ #:labels '(pool_name)))))
(gc-metrics-updater
(get-gc-metrics-updater registry))
- branch master updated (26f517d -> 5f80856), Christopher Baines, 2024/03/07
- 02/06: Copy the port monitoring fiber from the build coordinator, Christopher Baines, 2024/03/07
- 04/06: Add exception handling to the process-jobs script, Christopher Baines, 2024/03/07
- 01/06: Instrument resource pool checkout failures,
Christopher Baines <=
- 05/06: Report the size of the guix database and WAL file, Christopher Baines, 2024/03/07
- 03/06: Further fix handling exceptions when computing channel instances, Christopher Baines, 2024/03/07
- 06/06: Allow prioritising jobs from specific repositories, Christopher Baines, 2024/03/07