emacs-devel
Re: revert-buffer and changes in encoding


From: Kenichi Handa
Subject: Re: revert-buffer and changes in encoding
Date: Wed, 5 Jan 2005 10:18:23 +0900 (JST)

In article <address@hidden>, Stefan <address@hidden> writes:

>>>  Is that right?  Wouldn't that get automatically-chosen coding systems
>>>  as well as explicit user-specified coding systems?

>>  Yes.  But, whatever coding system is used for writing a
>>  file, revert-buffer should read the file with the same
>>  coding system.

> But then the name you chose is wrong.  If the name is "foobar-explicit" it
> should only be non-nil if foobar was set explicitly, not automatically.

Ah, ummm, I agree that the name is not good.  The current
semantics of that variable are "what Emacs thinks is the
encoding of the file on disk", and the semantics of
buffer-file-coding-system are "what Emacs will use by default
for writing out the buffer".

It seems that plain file-coding-system would be a better name
than buffer-file-coding-system-explicit.

> Knowing when buffer-file-coding-system was set explicitly is important also
> in select-safe-coding-system (where it should not try any other cs, including
> the preferred cs).

I think what is important in select-safe-coding-system is
whether buffer-file-coding-system has a local binding or not
(treating undecided-unix as having no local binding for the
text encoding but a local binding for the eol encoding).  If
it has a local binding, select-safe-coding-system should not
try any other encoding.  With that change, I think
select-safe-coding-system behaves correctly in all cases.
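
To make the rule above concrete, here is a minimal sketch in
Python (not Emacs Lisp, and not the actual implementation).
The function names, the has_local_binding flag and the
fallbacks list are all hypothetical; they only model the
decision described above.

```python
from typing import List, Optional, Tuple


def split_coding_system(name: str) -> Tuple[str, Optional[str]]:
    """Split e.g. 'latin-1-unix' into ('latin-1', 'unix').

    The eol part is None when the name carries no eol suffix.
    """
    for eol in ("unix", "dos", "mac"):
        suffix = "-" + eol
        if name.endswith(suffix):
            return name[: -len(suffix)], eol
    return name, None


def candidates(buffer_cs: str, has_local_binding: bool,
               fallbacks: List[str]) -> List[str]:
    """Return the text encodings that may be tried when saving.

    Hypothetical model: a local binding pins the text encoding,
    unless its text part is 'undecided' (as in undecided-unix),
    which counts as "no local binding" for the text encoding.
    """
    text, _eol = split_coding_system(buffer_cs)
    if has_local_binding and text != "undecided":
        return [text]          # do not try any other encoding
    return list(fallbacks)     # free to pick among the defaults
```

With this model, candidates("us-ascii", True, ["latin-1"])
yields only us-ascii, while undecided-unix with a local
binding still falls back to the default list.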

You wrote:
> If I open a new file, insert é and then do the following:

>    C-x RET f us-ascii RET
>    C-x C-s

> the file is saved in latin-1.  This is because when saving
> buffer-file-coding-system is just one of several coding-systems used.

> Another annoying situation is when you load a utf-8 file containing mostly
> latin-1 chars plus a few non-latin-1 chars.  Let's say you don't know that
> there are non-latin-1 chars and want to change the file to latin-1.  You do:

>    C-x RET f latin-1 RET
>    C-x C-s

> and the buffer and file is back to utf-8 !?!

In both cases, with the above change,
select-safe-coding-system will ask you which coding system
to use and show you the offending chars.
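
As an illustration of what "offending chars" means in the two
cases quoted above, here is a small Python sketch.  It uses
Python codec names ("ascii", "latin-1"), which are not Emacs
coding system names, and offending_chars is a hypothetical
helper, not an Emacs function.

```python
from typing import List


def offending_chars(text: str, codec: str) -> List[str]:
    """Return the distinct characters of TEXT that CODEC cannot encode."""
    bad: List[str] = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


# First case: a buffer containing é, saved as us-ascii.
print(offending_chars("café", "ascii"))
# Second case: mostly latin-1 text with one non-latin-1 char.
print(offending_chars("café €", "latin-1"))
```

A non-empty result is the situation in which the user should
be asked, instead of another encoding being chosen silently.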

> Another problem I've encountered (recently with the iso-2022-7bit ->
> utf-8 -> iso-2022-7bit dance in mule-cmds.el) is that iso-2022-7bit cannot
> encode eight-bit-control characters, so if you read an iso-2022-7bit file
> with invalid sequences in it, you get a buffer that you can't save.
> Worse yet, when you try to save it it might say "selected encoding
> mule-utf-8 disagrees with iso-2022-7bit-unix specified by file contents" but
> if you look at the buffer's modeline it says "J", not "u", so you wonder
> what's up with this utf-8 thing.

Whether we should silently allow saving such a file with
iso-2022-7bit is a separate problem.  If the offending
characters are only raw bytes, how about this:

Show in the *Warning* buffer:

As the buffer contains 8-bit characters, if you save it with
iso-2022-7bit, the file won't be read back correctly by the
same coding system.

And ask the user:

Do you really want to save it with iso-2022-7bit (y or n)?
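
The condition behind that warning can be pictured as a
round-trip check.  The Python sketch below is only an analogy:
it uses Python's "ascii" codec as a stand-in for a 7-bit
encoding (it is not iso-2022-7bit), and round_trips is a
hypothetical helper.

```python
def round_trips(data: bytes, codec: str) -> bool:
    """Return True if DATA survives decode-then-encode with CODEC unchanged."""
    try:
        return data.decode(codec).encode(codec) == data
    except UnicodeError:
        # The codec cannot represent these bytes at all.
        return False


print(round_trips(b"abc", "ascii"))      # plain 7-bit data
print(round_trips(b"ab\x80", "ascii"))   # contains a raw 8-bit byte
```

When the check fails, the file cannot be read back with the
same coding system, which is the case where the question above
would be asked.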

---
Ken'ichi HANDA
address@hidden
