guile-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] read-response-body should return received data when any brea


From: Ian Price
Subject: Re: [PATCH] read-response-body should return received data when any break happens
Date: Thu, 15 Mar 2012 18:31:52 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.3 (gnu/linux)

Nala Ginrut <address@hidden> writes:

> I've been troubled with a weird problem in read-response-body for a long time.
> I think read-response-body never return the received data when any break 
> happens. No matter the break caused by connection problem or user 
> interruption.
> The only possible read-response-body returns data is connection never down 
> and all the data have been received even if I want to download a 2G file. Or 
> there's no
> chance to write any data to the disk. When break occurs, all the received 
> data will evaporate.
> Considering my terrible network, I decide not to pray for good luck when I 
> establish connection with our web module.
> So here's a patch to fix it.
I don't think read-response-body is a good choice for downloading a 2GB
file in any case. Just as you wouldn't read a 2GB sorted file into
memory to perform a binary search, preferring to search directly on the
disk itself, I think people with expectations of treating large files
should write an appropriate handler for themselves using the lower
building blocks guile already provides.

> The new read-response-body will add the received data to the exceptional 
> information which used by "throw", if read-response-body can't continue to 
> work anymore, the received
> data will return with throw.
Yes, providing it at the point of error is best. The data is already
available to us and it won't affect normal returns.

This last part is particularly important to me, as I mentioned on IRC,
since programmers should not have to do anything special in the normal case.

> -(define (read-response-body r)
> +(define* (read-response-body r #:key (block 4096))
>    "Reads the response body from @var{r}, as a bytevector.  Returns
>  @code{#f} if there was no response body."
> -  (let ((nbytes (response-content-length r)))
> -    (and nbytes
> -         (let ((bv (get-bytevector-n (response-port r) nbytes)))
> -           (if (= (bytevector-length bv) nbytes)
> -               bv
> -               (bad-response "EOF while reading response body: ~a bytes of 
> ~a"
> -                            (bytevector-length bv) nbytes))))))
> +  (let* ((nbytes (response-content-length r))
> +         (bv (and nbytes (make-bytevector nbytes)))
> +         (start 0))
> +    (catch #t
> +           (lambda ()
> +             (let lp((buf (get-bytevector-n (response-port r) block)))
> +                 (if (eof-object? buf)
> +                     bv
> +                     (let ((len (bytevector-length buf)))
> +                       (cond
> +                        ((<= len block)
> +                         (bytevector-copy! buf 0 bv start len)
> +                         (set! start (+ start len))
> +                         (lp (get-bytevector-n (response-port r) block)))
> +                        (else
> +                         (bad-response "EOF while reading response body: ~a 
> bytes of ~a"
> +                                       start nbytes)))))))
The manual buffering is superfluous. get-bytevector-n already does this
under the hood, and much more efficiently. Even if it didn't, reading in
smaller chunks is only a win if you are also processing it in chunks,
since large blocks can make the gc interfere more often.

The only conceivable difference I could see is that you are choosing to
return an eof object rather of erroring. I'm not convinced of the
utility of that: #f or #vu8() I could understand, eof less so.

> +             (lambda (k . e)
> +               (let ((received (call-with-port 
> +                                (open-bytevector-input-port bv) 
> +                                (lambda (port)
> +                                  (get-bytevector-n port start)))))
> +                 (throw k `(,@e (body ,@received))) ;; return the received 
> data
> +                 )))))
> +
yuck. This consing a body symbol is hideous IMO. From a programmatic
point of view it is unnecessary, and all it's adding is requiring the
user to perform an extra cadr. It would be much better to choose a
different key to throw to, rather than doing it this way.

> +;; output the received data if there is, or do nothing
> +(define (output-received-response-body e port)
> +  (let ((received (assoc-ref (cadr e) 'body)))
> +    (if received
> +        (begin
> +          (put-bytevector port received) 
> +          (force-output port)))))
> +   
> +;; Exceptional information contains the received bytevector added from the
> +;; read-response-body if any exception had been caught.
> +;; If received data ware huge(it always does), it'd be a trouble during the 
> tracing.
> +;; This helper function could get rid of the received data from exceptional 
> info,
> +;; and re-throw it.
> +(define (throw-from-response-body-break e)
> +  (throw (car e) (list-head (cdr e) (1- (length (cdr e))))))
>  
>  (define (write-response-body r bv)
>    "Write @var{bv}, a bytevector, to the port corresponding to the HTTP

I know I don't have any sort of authority, but I'd like to veto
these. Special procedures to deal with an ad-hoc protocol layered over
another more appropriate protocol is just ugly. As I already mentioned,
exceptions have tags, and this is what these are for.

Fundamentally, I think this patch could be simplified to checking for an
eof from get-bytevector-n and changing the bad-response to an
"incomplete-response" that provides the bytevector.

What does everyone else think?

-- 
Ian Price

"Programming is like pinball. The reward for doing it well is
the opportunity to do it again" - from "The Wizardy Compiled"



reply via email to

[Prev in Thread] Current Thread [Next in Thread]