[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#24075: tls/https support in Guile (through r6rs binary ports?)
From: |
Christopher Allan Webber |
Subject: |
bug#24075: tls/https support in Guile (through r6rs binary ports?) |
Date: |
Sun, 06 Nov 2016 12:32:38 -0600 |
User-agent: |
mu4e 0.9.16; emacs 25.1.1 |
Some less good news: I found out that the https stuff is not working
right for all sites. I tested though... the code works *before* I
wrapped it in custom-binary-input/output-port.
After being wrapped though, strange things happen. For some sites (eg
"https://webmention.net/") things seem fine:
scheme@(guile-user)> (http-get (string->uri "https://webmention.net/"))
$7 = #<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers:
((server . "nginx/1.9.10") (date . #<date nanosecond: 0 second: 46 minute: 15
hour: 18 day: 6 month: 11 year: 2016 zone-offset: 0>) (content-type text/html
(charset . "UTF-8")) (transfer-encoding (chunked)) (connection close)
(x-powered-by . "PHP/5.6.21-1+donate.sury.org~trusty+4")) port: #<closed: file
f25310>>
$8 = "<!DOCTYPE html>\n<html>\n<head>\n <title>Webmention</title>\n <link
rel=\"stylesheet\" href=\"/styles.css\">\n</head>\n<body>\n\n<div
class=\"page\">\n \n <h1>Webmention</h1>\n \n <div
class=\"subtitle\">Webmention is a simple way to notify any URL when you link
to it from your site.</div>\n \n <ul class=\"links\">\n <li>The Webmention
specification is being developed under the <a
href=\"https://www.w3.org/wiki/Socialwg\">W3C Social Web Working
Group</a>.</li>\n <li class=\"main\"><a
href=\"https://www.w3.org/TR/webmention/\">Latest published version</a></li>\n
<li class=\"main\"><a href=\"http://webmention.net/draft/\">Latest editor's
draft</a></li>\n <li class=\"main\"><a
href=\"http://webmention.net/implementations/\">Implementations</a></li>\n
<li>The specification was contributed to the W3C by the IndieWeb community.
More information and history of the spec can be found on the <a
href=\"https://indieweb.org/webmention\">IndieWeb wiki</a>.</li>\n </ul>\n
\n</div>\n\n</body>\n</html>"
For other sites, especially ones where the pages are larger, things are
broken. For example, let's try to pull down the site of friend Joey
Hess:
scheme@(guile-user)> (http-get (string->uri "https://joeyh.name/"))
$9 = #<<response> version: (1 . 1) code: 200 reason-phrase: "OK" headers:
((date . #<date nanosecond: 0 second: 58 minute: 15 hour: 18 day: 6 month: 11
year: 2016 zone-offset: 0>) (server . "Apache/2.4.10 (Debian)") (last-modified
. #<date nanosecond: 0 second: 43 minute: 34 hour: 14 day: 24 month: 10 year:
2016 zone-offset: 0>) (etag "195c-53f9d4af683f3" . #t) (accept-ranges bytes)
(content-length . 6492) (vary accept-encoding) (cache-control (max-age . 0))
(expires . #<date nanosecond: 0 second: 58 minute: 15 hour: 18 day: 6 month: 11
year: 2016 zone-offset: 0>) (connection close) (content-type text/html)) port:
#<closed: file f4c070>>
$10 = "moz-background-size: cover;\n -o-background-size: cover;\n
background-size: cover;\n}\n.sidebar {\n background: none;\n border:
none;\n}\ninput#searchbox {\n display: none;\n}\n#pageinfo {\n display:
none;\n}\n.pageheader .actions ul {\n border-bottom: none;\n}\n#pagebody {\n
margin-left: 20%;\n}\n.archivepagedate {\n font-size: 0.5em;\n}\n.actions
{\n display: none;\n}\n</style>\n</div>\n\n\n\n\n<table>\n<tr>\n<td
width=\"33%\" valign=top><h3>personal</h3>\n\n<p><a
href=\"./blog/\">blog</a><br/>\n<a href=\"./pics/\">pics</a><br/>\n<a
href=\"./contact/\">contact me</a><br/>\n<a
href=\"./todo/\">todo</a></p>\n\n\n\n</td>\n<td width=\"33%\"
valign=top><h3>technical</h3>\n\n<p><a href=\"./code/\">code</a><br/>\n<a
href=\"./vcshome/\">vcshome</a><br/>\n<a href=\"./talks/\">talks</a><br/>\n<a
href=\"./screencasts/\">screencasts</a><br/>\n<a
href=\"./termcast/\">termcasts</a><br/>\n<a href=\"./rfc/\">rfcs</a><br/>\n<a
href=\"./boxen/\">boxen</a></p>\n\n\n\n</td>\n<td width=\"33%\"
valign=top><h3>fun</h3>\n\n<p><a href=\"./learnstofly/\">Joey Learns to
Fly</a><br/>\n<a href=\"http://olduse.net/\">oldusenet</a><br/>\n<a
href=\"./languages/\">languages</a><br/>\n<a href=\"./yurt/\">yurt</a><br/>\n<a
href=\"./caving/\">caving</a><br/>\n<a href=\"./grep/\">grep</a><br/>\n<a
href=\"./meta/\">meta</a></p>\n\n\n\n</td>\n</tr>\n</table>\n\n\n\n\n<h3>interviews</h3>\n\n<p><a
href=\"http://joey.hess.usesthis.com\">2012: The
Setup</a></p>\n\n<blockquote><p>\"When power is low, I often hack in the
evenings by lantern light.\"</p></blockquote>\n\n<p><a
href=\"http://zgrimshell.github.io/posts/interviews-with-floss-developers-joey-hess.html\">2015:
Life after Debian</a></p>\n\n<blockquote><p>\"I want to build worthwhile
things that might last.\"</p></blockquote>\n\n<p><a
href=\"http://lwn.net/Articles/672352/\">2016: Linux Weekly
News</a></p>\n\n<blockquote><p>\"I still see myself as a beginner, and
certainly not an
exemplar.\"</p></blockquote>\n\n\n\n\n\n\n</section>\n\n\n\n\n\n\n\n</div>\n\n<footer
id=\"footer\" class=\"pagefooter\" role=\"contentinfo\">\n\n<nav
id=\"pageinfo\">\n\n\n\n\n\n\n\n\n\n\n\n<div class=\"pagedate\">\nLast edited
<time datetime=\"2015-03-02T15:14:09Z\" class=\"relativedate\" title=\"Mon, 02
Mar 2015 10:14:09 -0500\">mid-morning Monday, March 2nd, 2015</time>\n<!--
Created <time datetime=\"2006-03-19T23:58:19Z\" pubdate=\"pubdate\"
class=\"relativedate\" title=\"Sun, 19 Mar 2006 18:58:19 -0500\">Sunday
evening, March 19th, 2006</time> -->\n</div>\n\n</nav>\n\n\n<!-- from joey
-->\n</footer>\n\n</article>\n\n</body>\n</html>\n\" title=\"Thu, 22 Sep 2016
16:13:21 -0400\">at teatime on Thursday, September 22nd,
2016</time>\n\n</span>\n</div>\n<div class=\"archivepage\">\n\n<a
href=\"./blog/entry/PoW_bucket_bloom/\">PoW bucket bloom: throttling anonymous
clients with proof of work, token buckets, and bloom filters</a><br />\n\n<span
class=\"archivepagedate\">\nPosted <time datetime=\"2016-09-13T05:14:47Z\"
pubdate=\"pubdate\" class=\"relativedate\" title=\"Tue, 13 Sep 2016 01:14:47
-0400\">late Monday night, September 13th, 2016</time>\n\n</span>\n</div>\n<div
class=\"archivepage\">\n\n<a href=\"./blog/entry/late_summer/\">late
summer</a><br />\n\n<span class=\"archivepagedate\">\nPosted <time
datetime=\"2016-08-31T01:15:40Z\" pubdate=\"pubdate\" class=\"relativedate\"
title=\"Tue, 30 Aug 2016 21:15:40 -0400\">late Tuesday evening, August 30th,
2016</time>\n\n</span>\n</div>\n\n\n\n\n\n</aside>\n\n\n\n<div
id=\"pagebody\">\n\n<section id=\"content\"
role=\"main\">\n\n\n<div>\n<style>\nhtml { \n background: url(joeykite.jpg)
no-repeat center center fixed; \n -webkit-background-size: cover;\n
-\x00r\x00\x00\x00e\x00\x00\x00t\x00\x00\x00u\x00\x00\x00r\x00\x00\x00n\x00\x00\x00e\x00\x00\x00d\x00\x00\x00
\x00\x00\x00a\x00\x00\x00s\x00\x00\x00 \x00\x00\x00a\x00\x00\x00
\x00\x00\x00b\x00\x00\x00y\x00\x00\x00t\x00\x00\x00e\x00\x00\x00v\x00\x00\x00e\x00\x00\x00c\x00\x00\x00t\x00\x00\x00o\x00\x00\x00r\x00\x00\x00.\x00\x00\x00\n\x00\x00\x00\n\x00\x00\x00H\x00\x00\x00o\x00\x00\x00w\x00\x00\x00e\x00\x00\x00v\x00\x00\x00e\x00\x00\x00r\x00\x00\x00,\x00\x00\x00
\x00\x00\x00i\x00\x00\x00f\x00\x00\x00
\x00\x00\x00S\x00\x00\x00T\x00\x00\x00R\x00\x00\x00E\x00\x00\x00A\x00\x00\x00M\x00\x00\x00I\x00\x00\x00N\x00\x00\x00G\x00\x00\x00?\x00\x00\x00
\x00\x00\x00i\x00\x00\x00s\x00\x00\x00
\x00\x00\x00t\x00\x00\x00r\x00\x00\x00u\x00\x00\x00e\x00\x00\x00,\x00\x00\x00
\x00\x00\x00i\x00\x00\x00n\x00\x00\x00s\x00\x00\x00t\x00\x00\x00e\x00\x00\x00a\x00\x00\x00d\x00\x00\x00
\x00\x00\x00o\x00\x00\x00f\x00\x00\x00
\x00\x00\x00e\x00\x00\x00a\x00\x00\x00g\x00\x00\x00e\x00\x00\x00r\x00\x00\x00l\x00\x00\x00y\x00\x00\x00
\x00\x00\x00r\x00\x00\x00e\x00\x00\x00a\x00\x00\x00d\x00\x00\x00i\x00\x00\x00n\x00\x00\x00g\x00\x00\x00
\x00\x00\x00t\x00\x00\x00h\x00\x00\x00e\x00\x00\x00
\x00\x00\x00r\x00\x00\x00e\x00\x00\x00s\x00\x00\x00p\x00\x00\x00o\x00\x00\x00n\x00\x00\x00s\x00\x00\x00e\x00\x00\x00\n\x00\x00\x00b\x00\x00\x00o\x00\x00\x00d\x00\x00\x00y\x00\x00\x00
\x00\x00\x00f\x00\x00\x00r\x00\x00\x00o\x00\x00\x00m\x00\x00\x00
\x00\x00\x00t\x00\x00\x00h\x00\x00\x00e\x00\x00\x00
\x00\x00\x00s\x00\x00\x00e\x00\x00\x00r\x00\x00\x00v\x00\x00\x00e\x00\x00\x00r\x00\x00\x00,\x00\x00\x00
\x00\x00\x00t\x00\x00\x00h\x00\x00\x00i\x00\x00\x00s\x00\x00"
(truncated a bit)
First of all, the response body starts in the wrong place... it should
start with "<!DOCTYPE html>". Then, somewhere in the middle it switches
to garbage output. I'm not sure why.
Again, it's fine before being wrapped in the custom-binary-input/output-port.
So either it's my fault (could well be) or there's a bug in the
custom-binary-input/output-port implementation. I feel like I don't
know enough to be sure. I would assume it's on my end, but since I
think this is the first major use of that interface, a bug seems hardly
impossible.
Anyway, to test this bug you'll need to have gnutls compiled with a
newer Guile. I've attached the hacky guix package I'm using to test
this. Then you'll want to do:
$ guix environment --ad-hoc gnutls-with-guile-next guile-next
(You need guile-next even if doing gnutls-with-guile-next in the
environment to enable the Guile 2.2 paths.)
I could use some help on this... I'm afraid that if I've done something
wrong, I'm not knowledgeable enough to know how to get out of the problem.
>From 91c0a4a728ca4bf2e9468cdc849c350dd3f7380f Mon Sep 17 00:00:00 2001
From: Christopher Allan Webber <address@hidden>
Date: Thu, 17 Sep 2015 15:14:54 -0500
Subject: [PATCH] web: Add https support through gnutls.
Since importing gnutls directly would result in a dependency cycle,
we load gnutls lazily.
This uses code originally written for Guix by Ludovic
* module/web/client.scm: (%http-receive-buffer-size)
(warn-no-gnutls-return-false, gnutls-module, ensure-gnutls)
(gnutls-ref, tls-wrap): New variables.
(open-socket-for-uri): Wrap in tls when uri scheme is https.
* doc/ref/web.texi (open-socket-for-uri): Document gnutls usage.
---
doc/ref/web.texi | 6 +-
module/web/client.scm | 175 +++++++++++++++++++++++++++++++++++++++++++-------
2 files changed, 158 insertions(+), 23 deletions(-)
diff --git a/doc/ref/web.texi b/doc/ref/web.texi
index becdc28..c2f3f61 100644
--- a/doc/ref/web.texi
+++ b/doc/ref/web.texi
@@ -1422,7 +1422,11 @@ the lower-level HTTP, request, and response modules.
@end example
@deffn {Scheme Procedure} open-socket-for-uri uri
-Return an open input/output port for a connection to URI.
+Return an open input/output port for a connection to URI. Guile
+dynamically loads gnutls for https support; for more information, see
address@hidden Preparations,
+how to install the GnuTLS bindings for Guile,, gnutls-guile,
+GnuTLS-Guile}.
@end deffn
@deffn {Scheme Procedure} http-get uri arg...
diff --git a/module/web/client.scm b/module/web/client.scm
index f24a4d7..f0fba49 100644
--- a/module/web/client.scm
+++ b/module/web/client.scm
@@ -1,6 +1,6 @@
;;; Web client
-;; Copyright (C) 2011, 2012, 2013, 2014 Free Software Foundation, Inc.
+;; Copyright (C) 2011, 2012, 2013, 2014, 2015, 2016 Free Software Foundation,
Inc.
;; This library is free software; you can redistribute it and/or
;; modify it under the terms of the GNU Lesser General Public
@@ -43,8 +43,11 @@
#:use-module (srfi srfi-1)
#:use-module (srfi srfi-9)
#:use-module (srfi srfi-9 gnu)
+ #:use-module ((rnrs io ports)
+ #:prefix rnrs-ports:)
#:export (current-http-proxy
open-socket-for-uri
+ open-connection-for-uri
http-get
http-get*
http-head
@@ -54,11 +57,113 @@
http-trace
http-options))
+(define %http-receive-buffer-size
+ ;; Size of the HTTP receive buffer.
+ 65536)
+
+;; Provide access to the gnutls-module, but fail gracefully if not available.
+;; Why take this route and not just straight up import the module?
+;; Guile can't depend on gnutls because gnutls includes Guile as a dependency.
+;; There's some risk of dependency cycles, so lazily resolving things only
+;; once needed helps!
+
+(define warn-no-gnutls-return-false
+ (lambda _
+ (format (current-error-port)
+ "warning: (gnutls) module not available\n")
+ #f))
+
+(define gnutls-module
+ (delay
+ (catch 'misc-error
+ (lambda ()
+ (let ((module (resolve-interface '(gnutls))))
+ ;; In some 2.1/2.2 installations installed alongside Guile 2.0,
gnutls
+ ;; can be imported but the bindings are broken as "unknown type".
+ ;; Here we check that gnutls-version is the right type (a procedure)
+ ;; to make sure the bindings are ok.
+ (if (procedure? (module-ref module 'gnutls-version))
+ module
+ (warn-no-gnutls-return-false))))
+ warn-no-gnutls-return-false)))
+
+(define (ensure-gnutls)
+ (if (not (force gnutls-module))
+ (throw 'gnutls-not-available "(gnutls) module not available")))
+
+(define (gnutls-ref symbol)
+ "Fetch method-symbol from the gnutls module"
+ (module-ref (force gnutls-module) symbol))
+
(define current-http-proxy
(make-parameter (let ((proxy (getenv "http_proxy")))
(and (not (equal? proxy ""))
proxy))))
+(define (tls-wrap port server)
+ "Return PORT wrapped in a TLS connection to SERVER. SERVER must be a DNS
+host name without trailing dot."
+ (define (log level str)
+ (format (current-error-port)
+ "gnutls: [~a|~a] ~a" (getpid) level str))
+
+ (ensure-gnutls)
+
+ (let ((session ((gnutls-ref 'make-session)
+ (gnutls-ref 'connection-end/client))))
+
+ ;; Some servers such as 'cloud.github.com' require the client to support
+ ;; the 'SERVER NAME' extension. However, 'set-session-server-name!' is
+ ;; not available in older GnuTLS releases. See
+ ;; <http://bugs.gnu.org/18526> for details.
+ (if (module-defined? (force gnutls-module)
+ 'set-session-server-name!)
+ ((gnutls-ref 'set-session-server-name!)
+ session (gnutls-ref 'server-name-type/dns) server)
+ (format (current-error-port)
+ "warning: TLS 'SERVER NAME' extension not supported~%"))
+
+ ((gnutls-ref 'set-session-transport-fd!) session (fileno port))
+ ((gnutls-ref 'set-session-default-priority!) session)
+
+ ;; The "%COMPAT" bit allows us to work around firewall issues (info
+ ;; "(gnutls) Priority Strings"); see <http://bugs.gnu.org/23311>.
+ ;; Explicitly disable SSLv3, which is insecure:
+ ;; <https://tools.ietf.org/html/rfc7568>.
+ ((gnutls-ref 'set-session-priorities!) session
"NORMAL:%COMPAT:-VERS-SSL3.0")
+
+ ((gnutls-ref 'set-session-credentials!) session
+ ((gnutls-ref 'make-certificate-credentials)))
+
+ ;; Uncomment the following lines in case of debugging emergency.
+ ;;(set-log-level! 10)
+ ;;(set-log-procedure! log)
+
+ ((gnutls-ref 'handshake) session)
+ (let ((record ((gnutls-ref 'session-record-port) session)))
+ (define (read! bv start count)
+ (define read-bv (get-bytevector-n record count))
+ (if (eof-object? read-bv)
+ 0 ; read! returns 0 on eof-object
+ (let ((read-bv-len (bytevector-length read-bv)))
+ (bytevector-copy! read-bv 0 bv 0 read-bv-len)
+ read-bv-len)))
+ (define (write! bv start count)
+ (put-bytevector record bv start count)
+ count)
+ (define (get-position)
+ (rnrs-ports:port-position record))
+ (define (set-position! new-position)
+ (rnrs-ports:set-port-position! record new-position))
+ (define (close)
+ (unless (port-closed? port)
+ (close-port port))
+ (unless (port-closed? record)
+ (close-port record)))
+ (make-custom-binary-input/output-port "gnutls wrapped port" read! write!
+ get-position set-position!
+ close))))
+
(define (ensure-uri uri-or-string)
(cond
((string? uri-or-string) (string->uri uri-or-string))
@@ -81,27 +186,53 @@
0))
(lambda (ai1 ai2)
(equal? (addrinfo:addr ai1) (addrinfo:addr ai2))))))
-
- (let loop ((addresses addresses))
- (let* ((ai (car addresses))
- (s (with-fluids ((%default-port-encoding #f))
- ;; Restrict ourselves to TCP.
- (socket (addrinfo:fam ai) SOCK_STREAM IPPROTO_IP))))
- (catch 'system-error
- (lambda ()
- (connect s (addrinfo:addr ai))
-
- ;; Buffer input and output on this port.
- (setvbuf s 'block)
- ;; If we're using a proxy, make a note of that.
- (when http-proxy (set-http-proxy-port?! s #t))
- s)
- (lambda args
- ;; Connection failed, so try one of the other addresses.
- (close s)
- (if (null? (cdr addresses))
- (apply throw args)
- (loop (cdr addresses))))))))
+ (define https?
+ (eq? 'https (uri-scheme uri)))
+ (define (open-socket)
+ (let loop ((addresses addresses))
+ (let* ((ai (car addresses))
+ (s (with-fluids ((%default-port-encoding #f))
+ ;; Restrict ourselves to TCP.
+ (socket (addrinfo:fam ai) SOCK_STREAM IPPROTO_IP))))
+ (catch 'system-error
+ (lambda ()
+ (connect s (addrinfo:addr ai))
+
+ ;; Buffer input and output on this port.
+ (setvbuf s 'block)
+ ;; If we're using a proxy, make a note of that.
+ (when http-proxy (set-http-proxy-port?! s #t))
+ s)
+ (lambda args
+ ;; Connection failed, so try one of the other addresses.
+ (close s)
+ (if (null? (cdr addresses))
+ (apply throw args)
+ (loop (cdr addresses))))))))
+
+ (let-syntax ((with-https-proxy
+ (syntax-rules ()
+ ((_ exp)
+ ;; For HTTPS URIs, honor 'https_proxy', not 'http_proxy'.
+ ;; FIXME: Proxying is not supported for https.
+ (let ((thunk (lambda () exp)))
+ (if (and https?
+ current-http-proxy)
+ (parameterize ((current-http-proxy #f))
+ (when (and=> (getenv "https_proxy")
+ (negate string-null?))
+ (format (current-error-port)
+ "warning: 'https_proxy' is ignored~%"))
+ (thunk))
+ (thunk)))))))
+ (with-https-proxy
+ (let ((s (open-socket)))
+ ;; Buffer input and output on this port.
+ (setvbuf s _IOFBF %http-receive-buffer-size)
+
+ (if https?
+ (tls-wrap s (uri-host uri))
+ s)))))
(define (extend-request r k v . additional)
(let ((r (set-field r (request-headers)
--
2.10.2