emacs-devel

Re: Resending email in Gnus, figuring out charset


From: Adam Sjøgren
Subject: Re: Resending email in Gnus, figuring out charset
Date: Wed, 31 Oct 2018 20:43:44 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)

Eli writes:

>> The Content-Transfer-Encoding: 8bit header means "raw bytes in the
>> body", and the Content-Type: text/plain; charset=utf-8 explains how
>> those bytes should be interpreted, right?
>
> These headers tell the receiving end how to interpret the message.

Yes. So as I received this email, Gnus should be interpreting the bytes
as utf-8. And it seems to be, as they are displayed correctly.
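
To illustrate what the two headers mean for a receiver, here is a minimal
sketch in Python (not Gnus code, just the standard library's email
package). The sample message is made up for the illustration; it assumes
an 8bit body whose bytes are to be read using the charset named in
Content-Type:

```python
from email import message_from_bytes
from email.policy import default

# Hypothetical sample message: 8bit body holding the UTF-8 bytes for
# RIGHTWARDS ARROW (E2 86 92) and for the accented e (C3 A9).
raw = (
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    b"\xe2\x86\x92 caf\xc3\xa9\r\n"
)

msg = message_from_bytes(raw, policy=default)

# Content-Type says how the body bytes should be interpreted.
charset = msg.get_content_charset()

# With an 8bit transfer encoding the body bytes come back unchanged.
body_bytes = msg.get_payload(decode=True)

# Decoding with the declared charset gives the characters to display.
text = body_bytes.decode(charset)
print(charset, body_bytes, text.strip())
```
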

> But I meant something different: what you have in the Gnus buffer
> _before_ the message is sent.

Before I resend the message, the buffer looks correct (i.e. I see the
arrow and the accented e rather than \nnn\nnn\nnn etc.)

>> When I look at the feedbase-email in Gnus, it is displayed as expected,
>> but when I try to resend it, for some reason Gnus can't guess what the
>> encoding should be.
>
> That's a sign of raw bytes in the buffer.
>
> If you go to one of the offending characters in the Gnus buffer and
> type "C-u C-x =", what does Emacs show about those characters?

Ok, if I open the feedbase-email in Gnus, before I press S D r to
resend, and move point to → and é in the *Article* buffer, I get:

               position: 530 of 684 (77%), column: 1
              character: → (displayed as →) (codepoint 8594, #o20622, #x2192)
      preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0x2192
                 script: symbol
                 syntax: .      which means: punctuation
               category: .:Base, c:Chinese, h:Korean, j:Japanese
               to input: type "C-x 8 RET 2192" or "C-x 8 RET RIGHTWARDS ARROW"
            buffer code: #xE2 #x86 #x92
              file code: #xE2 #x86 #x92 (encoded by coding system utf-8-unix)
                display: by this font (glyph code)
      xft:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#x7AE)

  Character code properties: customize what to show
    name: RIGHTWARDS ARROW
    old-name: RIGHT ARROW
    general-category: Sm (Symbol, Math)
    decomposition: (8594) ('→')

and:

               position: 284 of 684 (41%), column: 6
              character: é (displayed as é) (codepoint 233, #o351, #xe9)
      preferred charset: unicode (Unicode (ISO10646))
  code point in charset: 0xE9
                 script: latin
                 syntax: w      which means: word
               category: .:Base, L:Left-to-right (strong), c:Chinese, j:Japanese, l:Latin, v:Viet
               to input: type "C-x 8 RET e9" or "C-x 8 RET LATIN SMALL LETTER E WITH ACUTE"
            buffer code: #xC3 #xA9
              file code: #xC3 #xA9 (encoded by coding system utf-8-unix)
                display: by this font (glyph code)
      xft:-PfEd-DejaVu Sans Mono-normal-normal-normal-*-20-*-*-*-m-0-iso10646-1 (#xAB)

  Character code properties: customize what to show
    name: LATIN SMALL LETTER E WITH ACUTE
    old-name: LATIN SMALL LETTER E ACUTE
    general-category: Ll (Letter, Lowercase)
    decomposition: (101 769) ('e' '́')
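
As a cross-check outside Emacs, the values reported above can be
reproduced with a short Python sketch. It only assumes the two
characters in question; nothing in it is specific to Gnus:

```python
import unicodedata

arrow = "\u2192"    # RIGHTWARDS ARROW, codepoint 8594
e_acute = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, codepoint 233

# "buffer code" / "file code" above are the UTF-8 bytes of the character.
arrow_bytes = arrow.encode("utf-8")
e_bytes = e_acute.encode("utf-8")

# The canonical decomposition of the accented e is 'e' plus a combining
# acute accent, matching "decomposition: (101 769)".
decomposed = [ord(c) for c in unicodedata.normalize("NFD", e_acute)]

for ch, encoded in ((arrow, arrow_bytes), (e_acute, e_bytes)):
    print(ord(ch), hex(ord(ch)), unicodedata.name(ch), encoded.hex(" "))
print(decomposed)
```

Both characters are proper Unicode characters with well-formed UTF-8
encodings, which is consistent with the *Article* buffer holding decoded
text at this point.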


  Best regards,

    Adam

-- 
 "God must've been punting angels left and right."            Adam Sjøgren
                                                         address@hidden



