bug-gnu-emacs

bug#71080: 30.0.50; UTF-8 used unconditionally when saving GPG file


From: Eli Zaretskii
Subject: bug#71080: 30.0.50; UTF-8 used unconditionally when saving GPG file
Date: Mon, 20 May 2024 19:20:55 +0300

> Cc: monnier@iro.umontreal.ca
> Date: Mon, 20 May 2024 11:43:31 -0400
> From:  Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors" <bug-gnu-emacs@gnu.org>
> 
> Then
> 
>     emacs -Q ~/tmp/foo.txt
>     C-x C-w foo.gpg RET        # To save the file into an encrypted `foo.gpg`.
>     TAB TAB RET                # To select symmetric encryption.
>     .. type the password you'd like to use ...
>     M-x revert-buffer RET
> 
> and then you should see that the `λ` turned into its UTF-8 sequence `\316\273`.
> The same happens if you encrypt with public keys and if you use any
> other encoding that's different from UTF-8.
> 
> AFAICT, the problem is partly due to
> 
>     (find-coding-systems-region (point-min) (point-max))
> 
> returning a list which includes `no-conversion` because in the end the
> buffer is saved with "no conversion" (i.e. it uses Emacs's internal
> encoding).  Another part of the problem is that `find-auto-coding`
> returns `no-conversion` for `.gpg` files because those files are binary.
> 
> IOW it can be considered as the result of "no conversion" being
> ambiguous, meaning either "binary" or "Emacs's internal encoding"
> depending on the circumstances.  But it's also due to the confusion
> between the encoding to use before encryption (resp. after decryption)
> and the encoding to use after encryption (resp. before decryption).
> 
> I don't understand enough of how the "no conversion" ambiguity is
> expected to be resolved, nor how the different layers of encoding
> are supposed to be handled in file-name-handlers to dig much deeper.

How can this work reliably, unless the *.gpg files can have some
meta-data that tells Emacs how to decode them?  When encoding, we
could perhaps use buffer-file-coding-system (AFAICT, we do that
indirectly now, via select-safe-coding-system), but what to do when
decoding?

If _you_ know the correct encoding, you could use "C-x RET c" before
the commands (as in "C-x RET c iso-2022-7bit RET C-x C-w").  Did you
try that?

IOW, I don't think the problem is with 'no-conversion'; the problem is
that when decoding, we don't really have any useful info for how to
decode, and the locale-dependent ad-hoc'ery doesn't help because
GPG-encrypted stuff is likely to come from a different locale.  You
deliberately used iso-2022-7bit, which simulates such an "alien" file.
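
To make the decoding problem concrete, here is a small sketch in Python
(used purely for illustration; it is not how Emacs or epa handle coding
systems).  It shows that the bytes `\316\273` from the recipe are simply
the UTF-8 encoding of `λ`, and that the same bytes decode without error
under a different codec, yielding different text.  The choice of
ISO-8859-7 as the second encoding and the helper `octal_escapes` are
assumptions made for the example; neither comes from the report.

```python
# Illustration only: the same character has different on-disk bytes
# depending on the encoding, and the bytes alone do not say which was used.
text = "λ"

utf8_bytes = text.encode("utf-8")        # the bytes shown as \316\273 in the recipe
greek_bytes = text.encode("iso8859_7")   # a second encoding, chosen for illustration


def octal_escapes(data: bytes) -> str:
    """Render bytes as backslash-octal escapes, e.g. \\316\\273."""
    return "".join(f"\\{b:03o}" for b in data)


print(octal_escapes(utf8_bytes))
print(octal_escapes(greek_bytes))

# Decoding with the wrong guess does not raise an error here;
# it silently produces text that differs from the original.
wrong_guess = utf8_bytes.decode("iso8859_7")
print(wrong_guess == text)
```

Nothing in the byte stream itself distinguishes the two readings, which
is why some out-of-band information about the encoding would be needed
when decoding.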

Am I missing something (I know very little about epa, so apologies if
what I say makes no sense)?




