bug-gnu-emacs

bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request


From: Eli Zaretskii
Subject: bug#24117: 25.1; url-http-create-request: Multibyte text in HTTP request
Date: Tue, 02 Aug 2016 18:25:37 +0300

> Cc: 24117@debbugs.gnu.org
> From: Dmitry Gutov <dgutov@yandex.ru>
> Date: Tue, 2 Aug 2016 03:52:25 +0300
> 
> (length (concat (encode-coding-string "фыва" 'utf-8) 
> (string-as-multibyte "abc")))
> 
> => 11
> 
> (string-bytes (concat (encode-coding-string "фыва" 'utf-8) 
> (string-as-multibyte "abc")))
> 
> => 19
> 
> And
> 
> (multibyte-string-p (url-host (url-generic-parse-url "http://127.0.0.1")))
> 
> => t
> 
> Apparently, url-generic-parse-url creates a multibyte string for the 
> host name because it performs its parsing in a buffer. And 
> url-http-create-request uses the return value of (url-host 
> url-http-target-url) to set the Host header. And all of that gets 
> concatenated in the request.

Thanks for spelling this out.

> Some possible solutions:
> 
> - Perform the "string-bytes = length" verification only for 
> url-http-data, not the whole request string. This strikes me as 
> ugly, but apparently we've been living with using a multibyte string 
> here for a while.
> 
> - Call url-encode-url on the return value of (url-host 
> url-http-target-url), and hope that no similar problem pops up with any 
> of the related variables. This does solve the immediate problem with 
> anaconda-mode, I've checked.
> 
> - Something else?

How about making the temporary buffer parsed by url-generic-parse-url
a unibyte buffer?  Does that fix the problem?  AFAIU, RFC 3986 doesn't
allow non-ASCII characters, so we should be okay handling that in a
unibyte buffer, right?  I mean something like this:

    (with-temp-buffer
      ;; Don't let those temp-buffer modifications accidentally
      ;; deactivate the mark of the current-buffer.
      (let ((deactivate-mark nil))
        (set-syntax-table url-parse-syntax-table)
        (erase-buffer)
        (set-buffer-multibyte nil)   ;; <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        (insert url)
        (goto-char (point-min))
        ...
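
To see why the buffer's representation matters here, the following is a
minimal sketch, separate from the actual parser code.  The helper name
my-temp-buffer-substring-multibyte-p is hypothetical; it only relies on
with-temp-buffer creating a multibyte buffer by default and on
buffer-substring returning a string in the buffer's own representation.

```elisp
;; Hypothetical helper for illustration only; not part of url-parse.el.
(defun my-temp-buffer-substring-multibyte-p (text unibyte)
  "Insert TEXT into a temporary buffer and copy it back out.
If UNIBYTE is non-nil, make the temporary buffer unibyte first.
Return non-nil if the extracted string is multibyte."
  (with-temp-buffer
    (when unibyte
      ;; Same call as in the suggested change above.
      (set-buffer-multibyte nil))
    (insert text)
    (multibyte-string-p (buffer-substring (point-min) (point-max)))))

;; Default (multibyte) temporary buffer:
(my-temp-buffer-substring-multibyte-p "127.0.0.1" nil)

;; Temporary buffer switched to unibyte before the insertion:
(my-temp-buffer-substring-multibyte-p "127.0.0.1" t)
```

The first call should return t, matching the url-host result quoted
above, and the second should return nil.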

As for other possible problems like that, are there any that could be
expected already?  If so, we could try fixing them now.
Alternatively, we could just wait for them to come up; after all,
catching those was the main rationale for introducing the length test,
right?
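
For reference, the "string-bytes = length" test under discussion can be
sketched roughly as below.  The predicate name
my-request-bytes-match-length-p is hypothetical, and this is not the
exact code in url-http-create-request; string-to-multibyte stands in for
the string-as-multibyte call from the quoted example.

```elisp
;; Hypothetical predicate for illustration only; not the url-http.el code.
(defun my-request-bytes-match-length-p (request)
  "Return non-nil if REQUEST has as many bytes as characters.
A mismatch means REQUEST is a multibyte string containing
characters whose internal representation is longer than one byte."
  (= (string-bytes request) (length request)))

;; Pure ASCII request text: bytes and characters agree.
(my-request-bytes-match-length-p
 (concat "GET / HTTP/1.1\r\n" "Host: 127.0.0.1\r\n"))

;; Encoded (unibyte) data concatenated with a multibyte ASCII string:
;; the result is multibyte and each raw byte takes two bytes internally,
;; so the test fails (19 bytes vs. 11 characters, as quoted above).
(my-request-bytes-match-length-p
 (concat (encode-coding-string "фыва" 'utf-8)
         (string-to-multibyte "abc")))
```

The first form should return t and the second nil.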

Thanks.




