bug-guile
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#24075: tls/https support in Guile (through r6rs binary ports?)


From: Christopher Allan Webber
Subject: bug#24075: tls/https support in Guile (through r6rs binary ports?)
Date: Sun, 06 Nov 2016 12:32:38 -0600
User-agent: mu4e 0.9.16; emacs 25.1.1

Some less good news: I found out that the https stuff is not working
right for all sites.  I tested though... the code works *before* I
wrapped it in custom-binary-input/output-port.

After being wrapped though, strange things happen.  For some sites (eg
"https://webmention.net/";) things seem fine:

scheme@(guile-user)> (http-get (string->uri "https://webmention.net/";))
$7 = #<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers: 
((server . "nginx/1.9.10") (date . #<date nanosecond: 0 second: 46 minute: 15 
hour: 18 day: 6 month: 11 year: 2016 zone-offset: 0>) (content-type text/html 
(charset . "UTF-8")) (transfer-encoding (chunked)) (connection close) 
(x-powered-by . "PHP/5.6.21-1+donate.sury.org~trusty+4")) port: #<closed: file 
f25310>>
$8 = "<!DOCTYPE html>\n<html>\n<head>\n  <title>Webmention</title>\n  <link 
rel=\"stylesheet\" href=\"/styles.css\">\n</head>\n<body>\n\n<div 
class=\"page\">\n  \n  <h1>Webmention</h1>\n  \n  <div 
class=\"subtitle\">Webmention is a simple way to notify any URL when you link 
to it from your site.</div>\n  \n  <ul class=\"links\">\n    <li>The Webmention 
specification is being developed under the <a 
href=\"https://www.w3.org/wiki/Socialwg\";>W3C Social Web Working 
Group</a>.</li>\n    <li class=\"main\"><a 
href=\"https://www.w3.org/TR/webmention/\";>Latest published version</a></li>\n  
  <li class=\"main\"><a href=\"http://webmention.net/draft/\";>Latest editor's 
draft</a></li>\n    <li class=\"main\"><a 
href=\"http://webmention.net/implementations/\";>Implementations</a></li>\n    
<li>The specification was contributed to the W3C by the IndieWeb community. 
More information and history of the spec can be found on the <a 
href=\"https://indieweb.org/webmention\";>IndieWeb wiki</a>.</li>\n  </ul>\n  
\n</div>\n\n</body>\n</html>"

For other sites, especially ones where the pages are larger, things are
broken.  For example, let's try to pull down the site of friend Joey
Hess:

scheme@(guile-user)> (http-get (string->uri "https://joeyh.name/";))
$9 = #<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers: 
((date . #<date nanosecond: 0 second: 58 minute: 15 hour: 18 day: 6 month: 11 
year: 2016 zone-offset: 0>) (server . "Apache/2.4.10 (Debian)") (last-modified 
. #<date nanosecond: 0 second: 43 minute: 34 hour: 14 day: 24 month: 10 year: 
2016 zone-offset: 0>) (etag "195c-53f9d4af683f3" . #t) (accept-ranges bytes) 
(content-length . 6492) (vary accept-encoding) (cache-control (max-age . 0)) 
(expires . #<date nanosecond: 0 second: 58 minute: 15 hour: 18 day: 6 month: 11 
year: 2016 zone-offset: 0>) (connection close) (content-type text/html)) port: 
#<closed: file f4c070>>
$10 = "moz-background-size: cover;\n    -o-background-size: cover;\n    
background-size: cover;\n}\n.sidebar {\n    background: none;\n    border: 
none;\n}\ninput#searchbox {\n    display: none;\n}\n#pageinfo {\n    display: 
none;\n}\n.pageheader .actions ul {\n    border-bottom: none;\n}\n#pagebody {\n 
   margin-left: 20%;\n}\n.archivepagedate {\n    font-size: 0.5em;\n}\n.actions 
{\n    display: none;\n}\n</style>\n</div>\n\n\n\n\n<table>\n<tr>\n<td 
width=\"33%\" valign=top><h3>personal</h3>\n\n<p><a 
href=\"./blog/\">blog</a><br/>\n<a href=\"./pics/\">pics</a><br/>\n<a 
href=\"./contact/\">contact me</a><br/>\n<a 
href=\"./todo/\">todo</a></p>\n\n\n\n</td>\n<td width=\"33%\" 
valign=top><h3>technical</h3>\n\n<p><a href=\"./code/\">code</a><br/>\n<a 
href=\"./vcshome/\">vcshome</a><br/>\n<a href=\"./talks/\">talks</a><br/>\n<a 
href=\"./screencasts/\">screencasts</a><br/>\n<a 
href=\"./termcast/\">termcasts</a><br/>\n<a href=\"./rfc/\">rfcs</a><br/>\n<a 
href=\"./boxen/\">boxen</a></p>\n\n\n\n</td>\n<td width=\"33%\" 
valign=top><h3>fun</h3>\n\n<p><a href=\"./learnstofly/\">Joey Learns to 
Fly</a><br/>\n<a href=\"http://olduse.net/\";>oldusenet</a><br/>\n<a 
href=\"./languages/\">languages</a><br/>\n<a href=\"./yurt/\">yurt</a><br/>\n<a 
href=\"./caving/\">caving</a><br/>\n<a href=\"./grep/\">grep</a><br/>\n<a 
href=\"./meta/\">meta</a></p>\n\n\n\n</td>\n</tr>\n</table>\n\n\n\n\n<h3>interviews</h3>\n\n<p><a
 href=\"http://joey.hess.usesthis.com\";>2012: The 
Setup</a></p>\n\n<blockquote><p>\"When power is low, I often hack in the 
evenings by lantern light.\"</p></blockquote>\n\n<p><a 
href=\"http://zgrimshell.github.io/posts/interviews-with-floss-developers-joey-hess.html\";>2015:
 Life after Debian</a></p>\n\n<blockquote><p>\"I want to build worthwhile 
things that might last.\"</p></blockquote>\n\n<p><a 
href=\"http://lwn.net/Articles/672352/\";>2016: Linux Weekly 
News</a></p>\n\n<blockquote><p>\"I still see myself as a beginner, and 
certainly not an 
exemplar.\"</p></blockquote>\n\n\n\n\n\n\n</section>\n\n\n\n\n\n\n\n</div>\n\n<footer
 id=\"footer\" class=\"pagefooter\" role=\"contentinfo\">\n\n<nav 
id=\"pageinfo\">\n\n\n\n\n\n\n\n\n\n\n\n<div class=\"pagedate\">\nLast edited 
<time datetime=\"2015-03-02T15:14:09Z\" class=\"relativedate\" title=\"Mon, 02 
Mar 2015 10:14:09 -0500\">mid-morning Monday, March  2nd, 2015</time>\n<!-- 
Created <time datetime=\"2006-03-19T23:58:19Z\" pubdate=\"pubdate\" 
class=\"relativedate\" title=\"Sun, 19 Mar 2006 18:58:19 -0500\">Sunday 
evening, March 19th, 2006</time> -->\n</div>\n\n</nav>\n\n\n<!-- from joey 
-->\n</footer>\n\n</article>\n\n</body>\n</html>\n\" title=\"Thu, 22 Sep 2016 
16:13:21 -0400\">at teatime on Thursday, September 22nd, 
2016</time>\n\n</span>\n</div>\n<div class=\"archivepage\">\n\n<a 
href=\"./blog/entry/PoW_bucket_bloom/\">PoW bucket bloom: throttling anonymous 
clients with proof of work, token buckets, and bloom filters</a><br />\n\n<span 
class=\"archivepagedate\">\nPosted <time datetime=\"2016-09-13T05:14:47Z\" 
pubdate=\"pubdate\" class=\"relativedate\" title=\"Tue, 13 Sep 2016 01:14:47 
-0400\">late Monday night, September 13th, 2016</time>\n\n</span>\n</div>\n<div 
class=\"archivepage\">\n\n<a href=\"./blog/entry/late_summer/\">late 
summer</a><br />\n\n<span class=\"archivepagedate\">\nPosted <time 
datetime=\"2016-08-31T01:15:40Z\" pubdate=\"pubdate\" class=\"relativedate\" 
title=\"Tue, 30 Aug 2016 21:15:40 -0400\">late Tuesday evening, August 30th, 
2016</time>\n\n</span>\n</div>\n\n\n\n\n\n</aside>\n\n\n\n<div 
id=\"pagebody\">\n\n<section id=\"content\" 
role=\"main\">\n\n\n<div>\n<style>\nhtml { \n    background: url(joeykite.jpg) 
no-repeat center center fixed; \n    -webkit-background-size: cover;\n    
-\x00r\x00\x00\x00e\x00\x00\x00t\x00\x00\x00u\x00\x00\x00r\x00\x00\x00n\x00\x00\x00e\x00\x00\x00d\x00\x00\x00
 \x00\x00\x00a\x00\x00\x00s\x00\x00\x00 \x00\x00\x00a\x00\x00\x00 
\x00\x00\x00b\x00\x00\x00y\x00\x00\x00t\x00\x00\x00e\x00\x00\x00v\x00\x00\x00e\x00\x00\x00c\x00\x00\x00t\x00\x00\x00o\x00\x00\x00r\x00\x00\x00.\x00\x00\x00\n\x00\x00\x00\n\x00\x00\x00H\x00\x00\x00o\x00\x00\x00w\x00\x00\x00e\x00\x00\x00v\x00\x00\x00e\x00\x00\x00r\x00\x00\x00,\x00\x00\x00
 \x00\x00\x00i\x00\x00\x00f\x00\x00\x00 
\x00\x00\x00S\x00\x00\x00T\x00\x00\x00R\x00\x00\x00E\x00\x00\x00A\x00\x00\x00M\x00\x00\x00I\x00\x00\x00N\x00\x00\x00G\x00\x00\x00?\x00\x00\x00
 \x00\x00\x00i\x00\x00\x00s\x00\x00\x00 
\x00\x00\x00t\x00\x00\x00r\x00\x00\x00u\x00\x00\x00e\x00\x00\x00,\x00\x00\x00 
\x00\x00\x00i\x00\x00\x00n\x00\x00\x00s\x00\x00\x00t\x00\x00\x00e\x00\x00\x00a\x00\x00\x00d\x00\x00\x00
 \x00\x00\x00o\x00\x00\x00f\x00\x00\x00 
\x00\x00\x00e\x00\x00\x00a\x00\x00\x00g\x00\x00\x00e\x00\x00\x00r\x00\x00\x00l\x00\x00\x00y\x00\x00\x00
 
\x00\x00\x00r\x00\x00\x00e\x00\x00\x00a\x00\x00\x00d\x00\x00\x00i\x00\x00\x00n\x00\x00\x00g\x00\x00\x00
 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00e\x00\x00\x00 
\x00\x00\x00r\x00\x00\x00e\x00\x00\x00s\x00\x00\x00p\x00\x00\x00o\x00\x00\x00n\x00\x00\x00s\x00\x00\x00e\x00\x00\x00\n\x00\x00\x00b\x00\x00\x00o\x00\x00\x00d\x00\x00\x00y\x00\x00\x00
 \x00\x00\x00f\x00\x00\x00r\x00\x00\x00o\x00\x00\x00m\x00\x00\x00 
\x00\x00\x00t\x00\x00\x00h\x00\x00\x00e\x00\x00\x00 
\x00\x00\x00s\x00\x00\x00e\x00\x00\x00r\x00\x00\x00v\x00\x00\x00e\x00\x00\x00r\x00\x00\x00,\x00\x00\x00
 \x00\x00\x00t\x00\x00\x00h\x00\x00\x00i\x00\x00\x00s\x00\x00"

(truncated a bit)

First of all, the response body starts in the wrong place... it should
start with "<!DOCTYPE html>".  Then, somewhere in the middle it switches
to garbage output.  I'm not sure why.

Again, it's fine before being wrapped in the custom-binary-input/output-port.

So either it's my fault (could well be) or there's a bug in the
custom-binary-input/output-port implementation.  I feel like I don't
know enough to be sure.  I would assume it's on my end, but since I
think this is the first major use of that interface, a bug seems hardly
impossible.

Anyway, to test this bug you'll need to have gnutls compiled with a
newer Guile.  I've attached the hacky guix package I'm using to test
this.  Then you'll want to do:

  $ guix environment --ad-hoc gnutls-with-guile-next guile-next

(You need guile-next even if doing gnutls-with-guile-next in the
environment to enable the Guile 2.2 paths.)

I could use some help on this... I'm afraid that if I've done something
wrong, I'm not knowledgeable enough to know how to get out of the problem.

>From 91c0a4a728ca4bf2e9468cdc849c350dd3f7380f Mon Sep 17 00:00:00 2001
From: Christopher Allan Webber <address@hidden>
Date: Thu, 17 Sep 2015 15:14:54 -0500
Subject: [PATCH] web: Add https support through gnutls.

Since importing gnutls directly would result in a dependency cycle,
we load gnutls lazily.

This uses code originally written for Guix by Ludovic

* module/web/client.scm: (%http-receive-buffer-size)
  (warn-no-gnutls-return-false, gnutls-module, ensure-gnutls)
  (gnutls-ref, tls-wrap): New variables.
  (open-socket-for-uri): Wrap in tls when uri scheme is https.
* doc/ref/web.texi (open-socket-for-uri): Document gnutls usage.
---
 doc/ref/web.texi      |   6 +-
 module/web/client.scm | 175 +++++++++++++++++++++++++++++++++++++++++++-------
 2 files changed, 158 insertions(+), 23 deletions(-)

diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index becdc28..c2f3f61 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -1422,7 +1422,11 @@ the lower-level HTTP, request, and response modules.
 @end example
 
 @deffn {Scheme Procedure} open-socket-for-uri uri
-Return an open input/output port for a connection to URI.
+Return an open input/output port for a connection to URI.  Guile
+dynamically loads gnutls for https support; for more information, see
address@hidden Preparations,
+how to install the GnuTLS bindings for Guile,, gnutls-guile,
+GnuTLS-Guile}.
 @end deffn
 
 @deffn {Scheme Procedure} http-get uri arg...
diff --git a/module/web/client.scm b/module/web/client.scm
index f24a4d7..f0fba49 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -1,6 +1,6 @@
 ;;; Web client
 
-;; Copyright (C) 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
+;; Copyright (C) 2011, 2012, 2013, 2014, 2015, 2016 Free Software Foundation, 
Inc.
 
 ;; This library is free software; you can redistribute it and/or
 ;; modify it under the terms of the GNU Lesser General Public
@@ -43,8 +43,11 @@
   #:use-module (srfi srfi-1)
   #:use-module (srfi srfi-9)
   #:use-module (srfi srfi-9 gnu)
+  #:use-module ((rnrs io ports)
+                #:prefix rnrs-ports:)
   #:export (current-http-proxy
             open-socket-for-uri
+            open-connection-for-uri
             http-get
             http-get*
             http-head
@@ -54,11 +57,113 @@
             http-trace
             http-options))
 
+(define %http-receive-buffer-size
+  ;; Size of the HTTP receive buffer.
+  65536)
+
+;; Provide access to the gnutls-module, but fail gracefully if not available.
+;; Why take this route and not just straight up import the module?
+;; Guile can't depend on gnutls because gnutls includes Guile as a dependency.
+;; There's some risk of dependency cycles, so lazily resolving things only
+;; once needed helps!
+
+(define warn-no-gnutls-return-false
+  (lambda _
+    (format (current-error-port)
+            "warning: (gnutls) module not available\n")
+    #f))
+
+(define gnutls-module
+  (delay
+    (catch 'misc-error
+      (lambda ()
+        (let ((module (resolve-interface '(gnutls))))
+          ;; In some 2.1/2.2 installations installed alongside Guile 2.0, 
gnutls
+          ;; can be imported but the bindings are broken as "unknown type".
+          ;; Here we check that gnutls-version is the right type (a procedure)
+          ;; to make sure the bindings are ok.
+          (if (procedure? (module-ref module 'gnutls-version))
+              module
+              (warn-no-gnutls-return-false))))
+      warn-no-gnutls-return-false)))
+
+(define (ensure-gnutls)
+  (if (not (force gnutls-module))
+      (throw 'gnutls-not-available "(gnutls) module not available")))
+
+(define (gnutls-ref symbol)
+  "Fetch method-symbol from the gnutls module"
+  (module-ref (force gnutls-module) symbol))
+
 (define current-http-proxy
   (make-parameter (let ((proxy (getenv "http_proxy")))
                     (and (not (equal? proxy ""))
                          proxy))))
 
+(define (tls-wrap port server)
+  "Return PORT wrapped in a TLS connection to SERVER.  SERVER must be a DNS
+host name without trailing dot."
+  (define (log level str)
+    (format (current-error-port)
+            "gnutls: [~a|~a] ~a" (getpid) level str))
+
+  (ensure-gnutls)
+
+  (let ((session ((gnutls-ref 'make-session)
+                  (gnutls-ref 'connection-end/client))))
+
+    ;; Some servers such as 'cloud.github.com' require the client to support
+    ;; the 'SERVER NAME' extension.  However, 'set-session-server-name!' is
+    ;; not available in older GnuTLS releases.  See
+    ;; <http://bugs.gnu.org/18526> for details.
+    (if (module-defined? (force gnutls-module)
+                         'set-session-server-name!)
+        ((gnutls-ref 'set-session-server-name!)
+         session (gnutls-ref 'server-name-type/dns) server)
+        (format (current-error-port)
+                "warning: TLS 'SERVER NAME' extension not supported~%"))
+
+    ((gnutls-ref 'set-session-transport-fd!) session (fileno port))
+    ((gnutls-ref 'set-session-default-priority!) session)
+
+    ;; The "%COMPAT" bit allows us to work around firewall issues (info
+    ;; "(gnutls) Priority Strings"); see <http://bugs.gnu.org/23311>.
+    ;; Explicitly disable SSLv3, which is insecure:
+    ;; <https://tools.ietf.org/html/rfc7568>.
+    ((gnutls-ref 'set-session-priorities!) session 
"NORMAL:%COMPAT:-VERS-SSL3.0")
+
+    ((gnutls-ref 'set-session-credentials!) session
+     ((gnutls-ref 'make-certificate-credentials)))
+
+    ;; Uncomment the following lines in case of debugging emergency.
+    ;;(set-log-level! 10)
+    ;;(set-log-procedure! log)
+
+    ((gnutls-ref 'handshake) session)
+    (let ((record ((gnutls-ref 'session-record-port) session)))
+      (define (read! bv start count)
+        (define read-bv (get-bytevector-n record count))
+        (if (eof-object? read-bv)
+            0  ; read! returns 0 on eof-object
+            (let ((read-bv-len (bytevector-length read-bv)))
+              (bytevector-copy! read-bv 0 bv 0 read-bv-len)
+              read-bv-len)))
+      (define (write! bv start count)
+        (put-bytevector record bv start count)
+        count)
+      (define (get-position)
+        (rnrs-ports:port-position record))
+      (define (set-position! new-position)
+        (rnrs-ports:set-port-position! record new-position))
+      (define (close)
+        (unless (port-closed? port)
+          (close-port port))
+        (unless (port-closed? record)
+          (close-port record)))
+      (make-custom-binary-input/output-port "gnutls wrapped port" read! write!
+                                            get-position set-position!
+                                            close))))
+
 (define (ensure-uri uri-or-string)
   (cond
    ((string? uri-or-string) (string->uri uri-or-string))
@@ -81,27 +186,53 @@
                         0))
        (lambda (ai1 ai2)
          (equal? (addrinfo:addr ai1) (addrinfo:addr ai2))))))
-
-  (let loop ((addresses addresses))
-    (let* ((ai (car addresses))
-           (s  (with-fluids ((%default-port-encoding #f))
-                 ;; Restrict ourselves to TCP.
-                 (socket (addrinfo:fam ai) SOCK_STREAM IPPROTO_IP))))
-      (catch 'system-error
-        (lambda ()
-          (connect s (addrinfo:addr ai))
-
-          ;; Buffer input and output on this port.
-          (setvbuf s 'block)
-          ;; If we're using a proxy, make a note of that.
-          (when http-proxy (set-http-proxy-port?! s #t))
-          s)
-        (lambda args
-          ;; Connection failed, so try one of the other addresses.
-          (close s)
-          (if (null? (cdr addresses))
-              (apply throw args)
-              (loop (cdr addresses))))))))
+  (define https?
+    (eq? 'https (uri-scheme uri)))
+  (define (open-socket)
+    (let loop ((addresses addresses))
+      (let* ((ai (car addresses))
+             (s  (with-fluids ((%default-port-encoding #f))
+                   ;; Restrict ourselves to TCP.
+                   (socket (addrinfo:fam ai) SOCK_STREAM IPPROTO_IP))))
+        (catch 'system-error
+          (lambda ()
+            (connect s (addrinfo:addr ai))
+
+            ;; Buffer input and output on this port.
+            (setvbuf s 'block)
+            ;; If we're using a proxy, make a note of that.
+            (when http-proxy (set-http-proxy-port?! s #t))
+            s)
+          (lambda args
+            ;; Connection failed, so try one of the other addresses.
+            (close s)
+            (if (null? (cdr addresses))
+                (apply throw args)
+                (loop (cdr addresses))))))))
+
+  (let-syntax ((with-https-proxy
+                (syntax-rules ()
+                  ((_ exp)
+                   ;; For HTTPS URIs, honor 'https_proxy', not 'http_proxy'.
+                   ;; FIXME: Proxying is not supported for https.
+                   (let ((thunk (lambda () exp)))
+                     (if (and https?
+                              current-http-proxy)
+                         (parameterize ((current-http-proxy #f))
+                           (when (and=> (getenv "https_proxy")
+                                        (negate string-null?))
+                             (format (current-error-port)
+                                     "warning: 'https_proxy' is ignored~%"))
+                           (thunk))
+                         (thunk)))))))
+    (with-https-proxy
+     (let ((s (open-socket)))
+       ;; Buffer input and output on this port.
+       (setvbuf s _IOFBF %http-receive-buffer-size)
+
+       (if https?
+           (tls-wrap s (uri-host uri))
+           s)))))
 
 (define (extend-request r k v . additional)
   (let ((r (set-field r (request-headers)
-- 
2.10.2


reply via email to

[Prev in Thread] Current Thread [Next in Thread]