bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
From: R. Diez
Subject: bug#48324: 27.2; hexl-mode duplicates the UTF-8 BOM
Date: Sun, 9 May 2021 23:38:18 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1
I think that hexl-mode has problems with the UTF-8 BOM byte sequence at the
beginning of a text file. The steps to reproduce this issue are:
Create a text file with a single line with 3 characters: 123
Evaluate (set-buffer-file-coding-system 'utf-8-with-signature-dos) and save the
file.
The file should now have the following contents (8 bytes):
ef bb bf 31 32 33 0d 0a
That is the UTF-8 BOM (ef bb bf), the ASCII digits 1, 2 and 3 (31 32 33), and
the end-of-line sequence CR LF (0d 0a).
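As a cross-check outside Emacs, the expected file contents can be reproduced with a few lines of Python. This is only a sketch: it assumes that Python's "utf-8-sig" codec produces the same bytes as Emacs's utf-8-with-signature-dos coding system for this input, and the variable names are made up for illustration.

```python
# Encode the three digits plus a DOS line ending (CR LF).
# Python's "utf-8-sig" codec prepends the UTF-8 BOM (ef bb bf) when encoding.
text = "123\r\n"
data = text.encode("utf-8-sig")

# Format the bytes the same way as in the report: two hex digits per byte.
hex_dump = " ".join(f"{b:02x}" for b in data)
print(hex_dump)   # ef bb bf 31 32 33 0d 0a
print(len(data))  # 8
```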
Now switch to hexl-mode, place the cursor on the '1' character (31 in hex),
call hexl-insert-hex-char, and enter 00 in order to replace the '1' with a
binary zero (NUL character).
The result is puzzling. Instead of the '1' (31) being replaced with NUL (00),
the UTF-8 BOM is duplicated: the characters '1', '2' and '3' have been
overwritten with the new copy of the BOM, the CR character has been replaced
with NUL, and the LF character is intact:
ef bb bf ef bb bf 00 0a
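To make the discrepancy concrete, the following Python sketch compares the result I expected (only the byte at offset 3 changed to 00) with the bytes actually shown above. The names original, expected and observed are just illustrative; the byte values are copied from this report.

```python
def dump(data: bytes) -> str:
    """Format bytes as space-separated two-digit hex, as used in this report."""
    return " ".join(f"{b:02x}" for b in data)

original = bytes.fromhex("ef bb bf 31 32 33 0d 0a")  # file before the edit
observed = bytes.fromhex("ef bb bf ef bb bf 00 0a")  # buffer after the edit

# What the edit was meant to do: overwrite the '1' at offset 3 with NUL.
buf = bytearray(original)
buf[3] = 0x00
expected = bytes(buf)

# Offsets at which the observed bytes differ from the expected ones.
diff_offsets = [i for i, (e, o) in enumerate(zip(expected, observed)) if e != o]

print(dump(expected))  # ef bb bf 00 32 33 0d 0a
print(diff_offsets)    # [3, 4, 5, 6]
```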
If you save, close and reload the file, it gains one byte, but that is probably
not important, just a consequence of having lost the CR character:
ef bb bf ef bb bf 00 0d 0a
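The extra byte is consistent with a simple model of the save, sketched below in Python. The model is an assumption on my part, not something verified against the Emacs sources: it supposes that saving with a -dos coding system writes every LF in the buffer as CR LF.

```python
after_edit = bytes.fromhex("ef bb bf ef bb bf 00 0a")  # 8 bytes, CR already lost

# Assumed model of saving with DOS line endings: each LF becomes CR LF.
# (after_edit contains no CR, so the naive replace cannot double one up.)
saved = after_edit.replace(b"\n", b"\r\n")

saved_dump = " ".join(f"{b:02x}" for b in saved)
print(saved_dump)  # ef bb bf ef bb bf 00 0d 0a
print(len(saved) - len(after_edit))  # 1
```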