bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
From: Eli Zaretskii
Subject: bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
Date: Sun, 03 Jul 2022 16:00:47 +0300

> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: rgm@gnu.org, schwab@linux-m68k.org, 48324@debbugs.gnu.org
> Date: Sun, 03 Jul 2022 14:07:43 +0200
>
> Lars Ingebrigtsen <larsi@gnus.org> writes:
>
> > Hm... I guess the only reliable solution across all coding systems is
> > (like your comment in the code says) to drop the encode-every-char and
> > try encoding strings, and then see whether the result is short enough.
> > That could be done somewhat efficiently using a binary search. I'll
> > have a go at it...
>
> And while I was at it, I changed it to return complete glyphs, not just
> complete code points.
>
> There's a behavioural change, though. This:
>
> (string-limit "foóá" 6 t 'utf-16)
>
> Now returns a string with a BOM, whereas previously it didn't.
So you get 6 characters + the BOM?
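
For reference, the approach quoted above (encode whole prefixes and
binary-search for the longest one that fits) could be sketched roughly
as below. This is only an illustration, not the code that was
committed: the function name `my-limit-encoded' is hypothetical, it
cuts at code points rather than complete glyphs, and it does nothing
special about a BOM that the coding system may prepend.

```elisp
;; Hypothetical helper, NOT the actual `string-limit' implementation.
(defun my-limit-encoded (string max-bytes coding-system)
  "Encode the longest prefix of STRING that fits in MAX-BYTES.
The prefix is encoded with CODING-SYSTEM; the value is a unibyte string.
Assumes the encoded length never shrinks as the prefix grows."
  (let ((lo 0)                  ; longest prefix length known to fit
        (hi (length string)))   ; longest prefix length still possible
    (while (< lo hi)
      ;; Round up so the loop always makes progress.
      (let ((mid (/ (+ lo hi 1) 2)))
        (if (<= (length (encode-coding-string
                         (substring string 0 mid) coding-system))
                max-bytes)
            (setq lo mid)
          (setq hi (1- mid)))))
    (encode-coding-string (substring string 0 lo) coding-system)))
```

Since each probe encodes the prefix as a whole, whatever the coding
system emits at the start of the output (such as a BOM) is counted
against MAX-BYTES, which is where the question above comes from.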