bug#52670: legacy base64 encoding of latin-1
From: mattiase
Subject: bug#52670: legacy base64 encoding of latin-1
Date: Sun, 19 Dec 2021 22:47:15 +0100
For what appears to be historical reasons, the base64 encoding functions
(base64-encode-string etc.) treat characters in the range U+0080..U+00FF as if
they were raw bytes in the 128..255 range. This means that
(base64-encode-string "ÿ")
and
(base64-encode-string "\xff")
return the same result although the strings are completely different. Attempts
to encode other multibyte characters fail (correctly). For example,
(base64-encode-string "Ÿ")
signals an error, as expected.
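The behaviour described above can be reproduced with a small sketch like the following. The helper my-base64-try is a hypothetical name used only for illustration. The strings are written with escapes so the example does not depend on the file's encoding: "\u00ff" is the multibyte string "ÿ", "\xff" is a unibyte string holding the raw byte 255, and "\u0178" is "Ÿ".

```elisp
;; Hypothetical helper (not part of Emacs): return the base64 encoding
;; of STRING, or the symbol `failed' if the encoder signals an error.
(defun my-base64-try (string)
  (condition-case nil
      (base64-encode-string string)
    (error 'failed)))

;; The two strings are different kinds of object.
(multibyte-string-p "\u00ff")   ; => t   (one character, U+00FF)
(multibyte-string-p "\xff")     ; => nil (one raw byte, 255)

(my-base64-try "\xff")          ; => "/w==" (the raw byte 255)
(my-base64-try "\u00ff")        ; same result while the legacy handling exists
(my-base64-try "\u0178")        ; => failed (multibyte character rejected)
```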
I propose we tighten up the behavior by eliminating the legacy handling of
characters in the U+0080..U+00FF range. Letting the bug stay in place enables
incorrect, brittle and error-prone usage: the functions are clearly intended to
be fed encoded text only and should signal an error otherwise, as stated in the
manual.
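For comparison, here is a minimal sketch of the usage the functions appear to be designed for: the caller first turns the text into bytes with an explicit coding system and only then base64-encodes the result. The wrapper my-base64-encode-text is a hypothetical name, and the default of utf-8 is an assumption made for this example, not something the manual prescribes.

```elisp
;; Hypothetical wrapper (not part of Emacs): encode TEXT to bytes with
;; CODING-SYSTEM (assumed default: utf-8), then base64-encode the bytes
;; without inserting line breaks.
(defun my-base64-encode-text (text &optional coding-system)
  (base64-encode-string
   (encode-coding-string text (or coding-system 'utf-8))
   t))

(my-base64-encode-text "\u00ff")               ; => "w78="  (UTF-8 bytes C3 BF)
(my-base64-encode-text "\u00ff" 'iso-latin-1)  ; => "/w=="  (Latin-1 byte FF)
(my-base64-encode-text "\u0178")               ; => "xbg="  (UTF-8 bytes C5 B8)
```

With the encoding made explicit, the result no longer depends on whether a character happens to fall in the U+0080..U+00FF range.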