Re: Single unrecognized character wrecks entire display
From: Peter Dyballa
Subject: Re: Single unrecognized character wrecks entire display
Date: Fri, 24 Aug 2012 17:01:49 +0200
On 24.08.2012 at 15:46, Alexandre Oberlin wrote:
> iconv acts just the same. It tells me the 13th character is faulty (\351),
> while only the 40th is (\234)
Alexandre,
I think you're making the same mistake here that I made before! \351 is not the
number of a character in Unicode but a single byte in the UTF-8 stream. The UTF
encodings are multi-byte encodings, so a lone byte \351 cannot stand for
character \351 (233 decimal, E9 hexadecimal). Iconv and GNU Emacs evidently find
isolated single bytes scattered through the text. This could also explain the
different counts: characters vs. bytes (13th vs. 40th).
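To make the byte-versus-character point concrete, here is a minimal Python sketch. The sample text and the helper name first_bad_byte are made up for illustration and are not taken from Alexandre's file; the only assumption is that the input is meant to be UTF-8 with a stray single byte \351 (0xE9) in it.

```python
# Hypothetical sample: valid UTF-8 text followed by one stray byte 0xE9 (\351).
good = "na\u00efve caf\u00e9 \u2013 ".encode("utf-8")
data = good + b"\xe9" + b" end"


def first_bad_byte(raw: bytes):
    """Return (byte_offset, char_offset, byte_value) of the first invalid
    UTF-8 sequence in raw, or None if raw decodes cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is a byte offset; decoding the valid prefix gives the
        # number of characters that precede the faulty byte.
        chars_before = len(raw[:err.start].decode("utf-8"))
        return err.start, chars_before, raw[err.start]
    return None


# As a proper UTF-8 character, U+00E9 takes two bytes, not the single byte 0xE9.
e_acute = "\u00e9".encode("utf-8")

result = first_bad_byte(data)
if result is not None:
    byte_offset, char_offset, value = result
    print(f"byte offset {byte_offset}, character offset {char_offset}, "
          f"byte \\{value:o}")
```

With this sample the faulty byte sits at byte offset 17 but only 13 characters precede it, because three of those characters occupy more than one byte each. A tool that counts characters and one that counts bytes would therefore report different positions for the same fault.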
Could you try a native MS Losedos GNU Emacs?
Could you send me, privately, a sample of this translation output from before
GNU Emacs or iconv changed anything? Could it be that the output is not plain
text but some structured format, in which the odd bytes you mentioned initially
switch font or emphasis, or mark where a paragraph ends or a footnote starts?
--
Greetings
Pete
"By filing this bug report you have challenged the honor of my family. Prepare
to die!"