bug-gnu-emacs
bug#52670: legacy base64 encoding of latin-1


From: mattiase
Subject: bug#52670: legacy base64 encoding of latin-1
Date: Sun, 19 Dec 2021 22:47:15 +0100

For what appears to be historical reasons, the base64 encoding functions
(base64-encode-string etc.) treat characters in the range U+0080..U+00FF as if
they were raw bytes in the 128..255 range. This means that

  (base64-encode-string "ÿ")

and

  (base64-encode-string "\xff")

return the same result although the strings are completely different. Attempts 
to encode other multibyte characters fail (correctly). For example,

  (base64-encode-string "Ÿ")

signals an error, as expected.
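
To make the difference between the first two strings concrete, here is a minimal sketch. It assumes the source is read with a multibyte coding system such as UTF-8, so that the literal "ÿ" is a multibyte string; the variable names are illustrative only.

```elisp
;; Sketch: two strings that the legacy base64 code treats alike,
;; although they are different kinds of Lisp string.
(let ((char-string "ÿ")     ; multibyte string holding the character U+00FF
      (byte-string "\xff")) ; unibyte string holding the raw byte 255
  (list (multibyte-string-p char-string)  ; t
        (multibyte-string-p byte-string)  ; nil
        (string-bytes char-string)        ; 2 (internal multibyte representation)
        (string-bytes byte-string)))      ; 1
```

The first string is text that still needs to be encoded; the second is already a sequence of bytes.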

I propose we tighten up the behavior by eliminating the legacy handling of
characters in the U+0080..U+00FF range. Leaving the bug in place enables
incorrect, brittle and error-prone usage: the functions are clearly intended to
be fed encoded text only and should signal an error otherwise, as stated in the
manual.
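
As an illustration of the intended usage, the following sketch encodes the text explicitly before handing it to base64-encode-string. The choice of coding systems here is only an example; callers would pick whatever encoding their protocol requires.

```elisp
;; Sketch: encode characters to bytes first, then base64-encode the bytes.
;; This does not rely on the legacy handling of U+0080..U+00FF.
(base64-encode-string (encode-coding-string "ÿ" 'utf-8))    ; => "w78="
(base64-encode-string (encode-coding-string "ÿ" 'latin-1))  ; => "/w=="
;; Characters outside Latin-1 work the same way once encoded.
(base64-encode-string (encode-coding-string "Ÿ" 'utf-8))    ; => "xbg="
```

Written this way, the choice of encoding is visible at the call site, and the result does not depend on how the function treats multibyte input.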
