Re: Email text that confuses charset recognition in emacs

emacs-devel

From:	Paul Eggert
Subject:	Re: Email text that confuses charset recognition in emacs
Date:	Tue, 16 Apr 2013 21:37:08 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5

On 04/16/2013 09:27 AM, Giorgos Keramidas wrote:
> the attached email message confuses the charset
> detection machinery of Emacs, and it starts interpreting all text as
> Japanese text -- even though most of the contents of the file are plain
> us-ascii text.

Although the text is US-ASCII, it also contains a valid ISO-2022-7bit
coding sequence (the two things are not incompatible),
which Emacs is properly detecting and converting.  The problem is that
the text later contains the invalid escape sequence

   ESC LF > > SP ( B

This sequence was intended to switch out of a Japanese charset (the
immediately preceding text is valid ISO-2022-7bit Japanese).  However,
a mailer that *thought* the text was ASCII inserted LF > > SP after the
ESC and before the ( B.  That corrupted the ESC ( B, so Emacs remains
in Japanese mode until the end of the input.
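
To make the corruption concrete, here is a small Python sketch (not the
original message; the sample string and the variable names are made up
for illustration).  It encodes a short Japanese string with Python's
iso2022_jp codec, then splices LF > > SP between the ESC and the ( B,
as the mailer apparently did:

```python
ESC = b"\x1b"

# "\u65e5\u672c\u8a9e" is a three-character Japanese sample string
# (hypothetical, not taken from the original email).
intact = "\u65e5\u672c\u8a9e and ASCII".encode("iso2022_jp")

# ESC ( B is the sequence that switches back to ASCII.
to_ascii = ESC + b"(B"

# Simulate the mailer: insert LF > > SP after the ESC, before the ( B.
corrupted = intact.replace(to_ascii, ESC + b"\n>> (B", 1)

print(intact)
print(corrupted)
```

Every byte is still 7-bit, so the result passes for ASCII, but the
switch back to ASCII is no longer present as a contiguous ESC ( B.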

Perhaps when Emacs is decoding ISO-2022-7bit and sees an invalid
escape sequence, it should switch back to ASCII.  That would have
fixed your problem, and wouldn't break the decoding of any valid
ISO-2022-7bit sequence.
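
A rough sketch of that idea in Python, not Emacs's actual decoder: it
only splits the input into runs labelled with the charset in effect,
and it assumes just the four designation sequences of ISO-2022-JP
(RFC 1468), a small subset of what ISO-2022-7bit allows.  The names
split_by_charset and reset_on_invalid are hypothetical.

```python
ESC = 0x1B

# Assumed subset: the designation sequences of ISO-2022-JP (RFC 1468).
DESIGNATIONS = {
    b"(B": "ascii",
    b"(J": "jisx0201-roman",
    b"$@": "jisx0208-1978",
    b"$B": "jisx0208-1983",
}


def split_by_charset(data, reset_on_invalid=True):
    """Return a list of (charset, bytes) runs found in data."""
    runs = []
    charset = "ascii"
    start = 0
    i = 0
    while i < len(data):
        if data[i] != ESC:
            i += 1
            continue
        # Close the run that ends just before this ESC.
        if i > start:
            runs.append((charset, data[start:i]))
        name = DESIGNATIONS.get(data[i + 1:i + 3])
        if name is not None:
            charset = name
            i += 3
        else:
            # Invalid escape sequence: fall back to ASCII if requested,
            # drop the stray ESC, and keep the following bytes as text.
            if reset_on_invalid:
                charset = "ascii"
            i += 1
        start = i
    if start < len(data):
        runs.append((charset, data[start:]))
    return runs


sample = b"\x1b$BF|K\\\x1b\n>> (B plain text"
for run in split_by_charset(sample):
    print(run)
```

With reset_on_invalid=False the sketch behaves like the current
decoder in this respect: the tail after the broken escape stays
labelled as Japanese.  Valid input never reaches the fallback branch,
so its runs are the same either way.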
