From: Andreas Rottmann Subject: Add implementation of SRFI 38 * module/srfi/srfi-38.scm: New file, partly based on the reference implementation and on Alex Shinn's public-domain implementation for Chicken. * module/Makefile.am (SRFI_SOURCES): Add srfi/srfi-38.scm. * test-suite/tests/srfi-38.test: New file, minimal test suite for SRFI 38. * test-suite/Makefile.am (SCM_TESTS): Added tests/srfi-38.test. * doc/ref/srfi-modules.texi: Add a node for SRFI 38. --- doc/ref/srfi-modules.texi | 125 +++++++++++++++++++++++++- module/Makefile.am | 1 + module/srfi/srfi-38.scm | 203 +++++++++++++++++++++++++++++++++++++++++ test-suite/Makefile.am | 1 + test-suite/tests/srfi-38.test | 50 ++++++++++ 5 files changed, 379 insertions(+), 1 deletions(-) diff --git a/doc/ref/srfi-modules.texi b/doc/ref/srfi-modules.texi index 238484c..b214483 100644 --- a/doc/ref/srfi-modules.texi +++ b/doc/ref/srfi-modules.texi @@ -42,6 +42,7 @@ get the relevant SRFI documents from the SRFI home page * SRFI-34:: Exception handling. * SRFI-35:: Conditions. * SRFI-37:: args-fold program argument processor +* SRFI-38:: External Representation for Data With Shared Structure * SRFI-39:: Parameter objects * SRFI-42:: Eager comprehensions * SRFI-45:: Primitives for expressing iterative lazy algorithms @@ -3619,7 +3620,6 @@ the user. Return true if @var{c} is of type @code{&error} or one of its subtypes. @end deffn - @node SRFI-37 @subsection SRFI-37 - args-fold @cindex SRFI-37 @@ -3706,6 +3706,129 @@ not named options. This includes arguments after @samp{--}. It is called with the argument in question, as well as the seeds. @end deffn address@hidden SRFI-38 address@hidden SRFI-38 - External Representation for Data With Shared Structure address@hidden SRFI-38 + +This subsection is based on address@hidden://srfi.schemers.org/srfi-38/srfi-38.html, the specification +of SRFI-38} written by Ray Dillinger. + address@hidden Copyright (C) Ray Dillinger 2003. All Rights Reserved. + address@hidden Permission is hereby granted, free of charge, to any person obtaining a address@hidden copy of this software and associated documentation files (the address@hidden "Software"), to deal in the Software without restriction, including address@hidden without limitation the rights to use, copy, modify, merge, publish, address@hidden distribute, sublicense, and/or sell copies of the Software, and to address@hidden permit persons to whom the Software is furnished to do so, subject to address@hidden the following conditions: + address@hidden The above copyright notice and this permission notice shall be included address@hidden in all copies or substantial portions of the Software. + address@hidden THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS address@hidden OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF address@hidden MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND address@hidden NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE address@hidden LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION address@hidden OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION address@hidden WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +This SRFI creates an alternative external representation for data +written and read using @code{write-with-shared-structure} and address@hidden It is identical to the grammar for +external representation for data written and read with @code{write} and address@hidden given in section 7 of R5RS, except that the single +production + address@hidden + --> | address@hidden example + +is replaced by the following five productions: + address@hidden + --> | | + --> #= + --> ## + --> | + --> + address@hidden example + address@hidden {Scheme procedure} write-with-shared-structure obj address@hidden {Scheme procedure} write-with-shared-structure obj port address@hidden {Scheme procedure} write-with-shared-structure obj port optarg + +Writes an external representation of @var{obj} to the given port. +Strings that appear in the written representation are enclosed in +doublequotes, and within those strings backslash and doublequote +characters are escaped by backslashes. Character objects are written +using the @code{#\} notation. + +Objects which denote locations rather than values (cons cells, vectors, +and non-zero-length strings in R5RS scheme; also Guile's structs, +bytevectors and ports and hash-tables), if they appear at more than one +point in the data being written, are preceded by @address@hidden the +first time they are written and replaced by @address@hidden all +subsequent times they are written, where @var{N} is a natural number +used to identify that particular object. If objects which denote +locations occur only once in the structure, then address@hidden must produce the same external +representation for those objects as @code{write}. + address@hidden terminates in finite time and +produces a finite representation when writing finite data. + address@hidden returns an unspecified value. The address@hidden argument may be omitted, in which case it defaults to the +value returned by @code{(current-output-port)}. The @var{optarg} +argument may also be omitted. If present, its effects on the output and +return value are unspecified but @code{write-with-shared-structure} must +still write a representation that can be read by address@hidden Some implementations may wish to use address@hidden to specify formatting conventions, numeric radixes, or +return values. Guile's implementation ignores @var{optarg}. + +For example, the code + address@hidden +(begin (define a (cons 'val1 'val2)) + (set-cdr! a a) + (write-with-shared-structure a)) address@hidden lisp + +should produce the output @code{#1=(val1 . #1#)}. This shows a cons +cell whose @code{cdr} contains itself. + address@hidden deffn + address@hidden {Scheme procedure} read-with-shared-structure address@hidden {Scheme procedure} read-with-shared-structure port + address@hidden converts the external representations +of Scheme objects produced by @code{write-with-shared-structure} into +Scheme objects. That is, it is a parser for the nonterminal address@hidden} in the augmented external representation grammar defined +above. @code{read-with-shared-structure} returns the next object +parsable from the given input port, updating @var{port} to point to the +first character past the end of the external representation of the +object. + +If an end-of-file is encountered in the input before any characters are +found that can begin an object, then an end-of-file object is returned. +The port remains open, and further attempts to read it (by address@hidden or @code{read} will also return an +end-of-file object. If an end of file is encountered after the +beginning of an object's external representation, but the external +representation is incomplete and therefore not parsable, an error is +signalled. + +The @var{port} argument may be omitted, in which case it defaults to the +value returned by @code{(current-input-port)}. It is an error to read +from a closed port. + address@hidden deffn @node SRFI-39 @subsection SRFI-39 - Parameters diff --git a/module/Makefile.am b/module/Makefile.am index 8086d82..b86123f 100644 --- a/module/Makefile.am +++ b/module/Makefile.am @@ -254,6 +254,7 @@ SRFI_SOURCES = \ srfi/srfi-34.scm \ srfi/srfi-35.scm \ srfi/srfi-37.scm \ + srfi/srfi-38.scm \ srfi/srfi-42.scm \ srfi/srfi-39.scm \ srfi/srfi-45.scm \ diff --git a/module/srfi/srfi-38.scm b/module/srfi/srfi-38.scm new file mode 100644 index 0000000..71eaff0 --- /dev/null +++ b/module/srfi/srfi-38.scm @@ -0,0 +1,203 @@ +;; Copyright (C) Andreas Rottmann 2010. +;; Copyright (C) Ray Dillinger 2003. All Rights Reserved. + +;; Permission is hereby granted, free of charge, to any person obtaining +;; a copy of this software and associated documentation files (the +;; "Software"), to deal in the Software without restriction, including +;; without limitation the rights to use, copy, modify, merge, publish, +;; distribute, sublicense, and/or sell copies of the Software, and to +;; permit persons to whom the Software is furnished to do so, subject to +;; the following conditions: + +;; The above copyright notice and this permission notice shall be +;; included in all copies or substantial portions of the Software. + +;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, +;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS +;; BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN +;; ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN +;; CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +;; SOFTWARE. + +(define-module (srfi srfi-38) + #:export (write-with-shared-structure + read-with-shared-structure) + #:use-module (rnrs bytevectors) + #:use-module (srfi srfi-8) + #:use-module (srfi srfi-69) + #:use-module (system vm trap-state)) + + +;; A printer that shows all sharing of substructures. Uses the Common +;; Lisp print-circle notation: #n# refers to a previous substructure +;; labeled with #n=. Takes O(n^2) time. + +;; Code attributed to Al Petrofsky, modified by Ray Dillinger. + +;; Modified in 2010 by Andreas Rottmann to use SRFI 69 hashtables, +;; making the time O(n), and adding some of Guile's data types to the +;; `interesting' objects. + +(define* (write-with-shared-structure obj + #:optional + (outport (current-output-port)) + (optarg #f)) + + ;; We only track duplicates of pairs, vectors, strings, bytevectors, + ;; structs (which subsume R6RS and SRFI-9 records), ports and (native) + ;; hash-tables. We ignore zero-length vectors and strings because + ;; r5rs doesn't guarantee that eq? treats them sanely (and they aren't + ;; very interesting anyway). + + (define (interesting? obj) + (or (pair? obj) + (and (vector? obj) (not (zero? (vector-length obj)))) + (and (string? obj) (not (zero? (string-length obj)))) + (bytevector? obj) + (struct? obj) + (port? obj) + (hash-table? obj))) + + ;; (write-obj OBJ STATE): + ;; + ;; STATE is a hashtable which has an entry for each interesting part + ;; of OBJ. The associated value will be: + ;; + ;; -- a number if the part has been given one, + ;; -- #t if the part will need to be assigned a number but has not been yet, + ;; -- #f if the part will not need a number. + ;; The entry `counter' in STATE should be the most recently + ;; assigned number. + ;; + ;; Mutates STATE for any parts that had numbers assigned. + (define (write-obj obj state) + (define (write-interesting) + (cond ((pair? obj) + (display "(" outport) + (write-obj (car obj) state) + (let write-cdr ((obj (cdr obj))) + (cond ((and (pair? obj) (not (hash-table-ref state obj))) + (display " " outport) + (write-obj (car obj) state) + (write-cdr (cdr obj))) + ((null? obj) + (display ")" outport)) + (else + (display " . " outport) + (write-obj obj state) + (display ")" outport))))) + ((vector? obj) + (display "#(" outport) + (let ((len (vector-length obj))) + (write-obj (vector-ref obj 0) state) + (let write-vec ((i 1)) + (cond ((= i len) (display ")" outport)) + (else (display " " outport) + (write-obj (vector-ref obj i) state) + (write-vec (+ i 1))))))) + ;; else it's a string + (else (write obj outport)))) + (cond ((interesting? obj) + (let ((val (hash-table-ref state obj))) + (cond ((not val) (write-interesting)) + ((number? val) + (begin (display "#" outport) + (write val outport) + (display "#" outport))) + (else + (let ((n (+ 1 (hash-table-ref state 'counter)))) + (display "#" outport) + (write n outport) + (display "=" outport) + (hash-table-set! state 'counter n) + (hash-table-set! state obj n) + (write-interesting)))))) + (else + (write obj outport)))) + + ;; Scan computes the initial value of the hash table, which maps each + ;; interesting part of the object to #t if it occurs multiple times, + ;; #f if only once. + (define (scan obj state) + (cond ((not (interesting? obj))) + ((hash-table-exists? state obj) + (hash-table-set! state obj #t)) + (else + (hash-table-set! state obj #f) + (cond ((pair? obj) + (scan (car obj) state) + (scan (cdr obj) state)) + ((vector? obj) + (let ((len (vector-length obj))) + (do ((i 0 (+ 1 i))) + ((= i len)) + (scan (vector-ref obj i) state)))))))) + + (let ((state (make-hash-table eq?))) + (scan obj state) + (hash-table-set! state 'counter 0) + (write-obj obj state))) + +;; A reader that understands the output of the above writer. This has +;; been written by Andreas Rottmann to re-use Guile's built-in reader, +;; with inspiration from Alex Shinn's public-domain implementation of +;; `read-with-shared-structure' found in Chicken's SRFI 38 egg. + +(define* (read-with-shared-structure #:optional (port (current-input-port))) + (let ((parts-table (make-hash-table eqv?))) + + ;; reads chars that match PRED and returns them as a string. + (define (read-some-chars pred initial) + (let iter ((chars initial)) + (let ((c (peek-char port))) + (if (or (eof-object? c) (not (pred c))) + (list->string (reverse chars)) + (iter (cons (read-char port) chars)))))) + + (define (read-hash c port) + (let* ((n (string->number (read-some-chars char-numeric? (list c)))) + (c (read-char port)) + (thunk (hash-table-ref/default parts-table n #f))) + (case c + ((#\=) + (if thunk + (error "Double declaration of part " n)) + (let* ((cell (list #f)) + (thunk (lambda () (car cell)))) + (hash-table-set! parts-table n thunk) + (let ((obj (read port))) + (set-car! cell obj) + obj))) + ((#\#) + (or thunk + (error "Use of undeclared part " n))) + (else + (error "Malformed shared part specifier"))))) + + (with-fluid* %read-hash-procedures (fluid-ref %read-hash-procedures) + (lambda () + (for-each (lambda (digit) + (read-hash-extend digit read-hash)) + '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9)) + (let ((result (read port))) + (if (< 0 (hash-table-size parts-table)) + (patch! result)) + result))))) + +(define (hole? x) (procedure? x)) +(define (fill-hole x) (if (hole? x) (fill-hole (x)) x)) + +(define (patch! x) + (cond + ((pair? x) + (if (hole? (car x)) (set-car! x (fill-hole (car x))) (patch! (car x))) + (if (hole? (cdr x)) (set-cdr! x (fill-hole (cdr x))) (patch! (cdr x)))) + ((vector? x) + (do ((i (- (vector-length x) 1) (- i 1))) + ((< i 0)) + (let ((elt (vector-ref x i))) + (if (hole? elt) + (vector-set! x i (fill-hole elt)) + (patch! elt))))))) diff --git a/test-suite/Makefile.am b/test-suite/Makefile.am index a76553b..0fe9c85 100644 --- a/test-suite/Makefile.am +++ b/test-suite/Makefile.am @@ -118,6 +118,7 @@ SCM_TESTS = tests/00-initial-env.test \ tests/srfi-34.test \ tests/srfi-35.test \ tests/srfi-37.test \ + tests/srfi-38.test \ tests/srfi-39.test \ tests/srfi-42.test \ tests/srfi-45.test \ diff --git a/test-suite/tests/srfi-38.test b/test-suite/tests/srfi-38.test new file mode 100644 index 0000000..56d8b87 --- /dev/null +++ b/test-suite/tests/srfi-38.test @@ -0,0 +1,50 @@ +;; -*- scheme -*- + +(define-module (test-srfi-38) + #:use-module (test-suite lib) + #:use-module (srfi srfi-38) + #:use-module (rnrs bytevectors)) + +(define (shared-structure->string object) + (call-with-output-string + (lambda (port) + (write-with-shared-structure object port)))) + +(define (roundtrip object) + (call-with-input-string (shared-structure->string object) + (lambda (port) + (read-with-shared-structure port)))) + +(with-test-prefix "pairs" + (let ((foo (cons 'value-1 #f))) + (set-cdr! foo foo) + (pass-if "writing" + (string=? "#1=(value-1 . #1#)" + (shared-structure->string foo))) + (pass-if "roundtrip" + (let ((result (roundtrip foo))) + (and (pair? result) + (eq? (car result) 'value-1) + (eq? (cdr result) result)))))) + +(with-test-prefix "bytevectors" + (let ((vec (vector 0 1 2 3)) + (bv (u8-list->bytevector '(42 42)))) + (vector-set! vec 0 bv) + (vector-set! vec 2 bv) + (pass-if "roundtrip" + (let ((result (roundtrip vec))) + (and (equal? '#(#vu8(42 42) 1 #vu8(42 42) 3) + result) + (eq? (vector-ref result 0) + (vector-ref result 2))))))) + +(with-test-prefix "mixed" + (let* ((pair (cons 'a 'b)) + (vec (vector 0 pair 2 pair #f))) + (vector-set! vec 4 vec) + (pass-if "roundtrip" + (let ((result (roundtrip vec))) + (and (eq? (vector-ref result 1) + (vector-ref result 3)) + (eq? result (vector-ref result 4))))))) -- tg: (60db12d..) t/srfi-38 (depends on: master t/read-hash-fluid)