help-emacs-windows

Re: [h-e-w] Determining coding system for text files


From: Eli Zaretskii
Subject: Re: [h-e-w] Determining coding system for text files
Date: Mon, 31 Oct 2011 22:36:49 +0200

> Date: Mon, 31 Oct 2011 15:28:22 -0400
> From: Eric Roode <address@hidden>
> 
> I would like new buffers to default to utf-8 encoding, and I would like
> indeterminate files (like text files, especially source code files) also to
> use utf-8, unless the -*- line specifies a different coding system.

What for?  What you ask for doesn't make sense without some
explanation.  It is meaningless to say that pure ASCII files should
have UTF-8 encoding, because UTF-8 encodes every 7-bit ASCII character
as the same single byte that ASCII uses, so a pure ASCII file is
byte-for-byte identical in both encodings.
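
The point about 7-bit text can be checked outside Emacs.  The following
is a minimal Python sketch, not anything Emacs itself runs; the sample
strings are made up for illustration.  It compares the bytes produced
by the two encodings and then shows where they start to differ.

```python
# Illustration only: pure 7-bit ASCII text yields the same bytes
# whether it is encoded as ASCII or as UTF-8.
text = "(prefer-coding-system 'utf-8)\n"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

same_bytes = ascii_bytes == utf8_bytes
print(same_bytes)

# Only a non-ASCII character makes the choice of encoding visible
# in the bytes on disk.
accented = "caf\u00e9"
print(accented.encode("utf-8"))
print(accented.encode("latin-1"))
```

Since the bytes are the same, nothing in a pure ASCII file can tell a
reader which of the two encodings was intended.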

> By default, when I create a new buffer that isn't associated with any file,
> the coding system is set to 'iso-latin-1-dos'. When I visit an existing
> (text) file, its coding system is set to 'undecided-dos'.
> 
> I tried to change this by executing
>        (prefer-coding-system 'utf-8)
> After that, when I create a new file, the coding system in the new buffer
> is set to 'utf-8'.

As expected.  I presume this accomplishes part of what you wanted.

> However, when I open an existing file, emacs still sets its coding
> system to 'undecided-dos'.

If the file includes only 7-bit ASCII, this is also expected
behavior.  Please explain why you aren't happy with this.

> Digging further, it seems that this is controlled by the variable
> file-coding-system-alist.  If a file name does not match any of the
> patterns in that list, the function find-buffer-file-type-coding-system (in
> dos-w32.el) is invoked to determine what coding system to use for the file.
> 
> That function *always* returns 'undecided' for text files, or
> 'no-conversion' for files it determines are binary.  The only time it uses
> the default value for buffer-file-coding-system is if the file doesn't yet
> exist!

That is how Emacs behaves on all platforms, even on Unix.  The default
value of buffer-file-coding-system is used only for non-existing files
or for buffers not related to files.  When a buffer visits an existing
file, Emacs always sets the buffer's coding system to match the
encoding it detects in the file.  prefer-coding-system just tells
Emacs which encoding to prefer when more than one can match the
contents of an existing file.
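
The role of the preference can be pictured with a toy model.  The
Python sketch below is an assumption-laden illustration and not Emacs's
actual detection code: the function name guess_coding, the priority
lists, and the use of Python codec names are all hypothetical.  It only
shows the idea that a priority list breaks ties between encodings that
could all decode the same bytes, and plays no part when the bytes are
pure ASCII or fit only one candidate.

```python
def guess_coding(data, priority):
    """Toy model of priority-based detection; NOT Emacs's real algorithm."""
    # Pure 7-bit data gives no evidence for any particular encoding.
    if all(b < 0x80 for b in data):
        return "undecided"
    # Otherwise take the first candidate, in priority order, that can
    # decode the bytes without error.
    for codec in priority:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue
        return codec
    return None


utf8_e_acute = "\u00e9".encode("utf-8")      # b"\xc3\xa9"
latin1_e_acute = "\u00e9".encode("latin-1")  # b"\xe9"

print(guess_coding(b"plain ascii", ["utf-8", "latin-1"]))
print(guess_coding(utf8_e_acute, ["utf-8", "latin-1"]))
print(guess_coding(utf8_e_acute, ["latin-1", "utf-8"]))
print(guess_coding(latin1_e_acute, ["utf-8", "latin-1"]))
```

In this model the bytes of a UTF-8 encoded character are also valid
Latin-1, so the order of the list decides the answer, while a lone
Latin-1 byte is not valid UTF-8 and is never reported as such however
strongly UTF-8 is preferred.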

> Am I reading this right?  There is no way to set a preferred coding system
> for existing files under Windows?

There _is_ a way, but it doesn't do what you expect.  Please explain
why your expectations are different, and in particular what is wrong
with the current behavior in your use cases.

> 'prefer-coding-system' only works in *nix environments?

No, it works the same on all platforms.  The Windows implementation
has a few subtle points, but it doesn't change the basic behavior.

> I have to either add every source and text file name
> pattern to file-coding-system-alist, or manually change the buffer coding
> every time I visit an existing file?

You don't and you shouldn't.  The buffer's coding system matters only
when you save the file, so changing it right after you visit a file
accomplishes nothing by itself.  After telling Emacs to prefer UTF-8,
as you did, whenever Emacs needs to encode a file when you save it, it
will use UTF-8 if possible, and if not, it will ask you for a
different encoding.  Again, if this is not what you want, please
explain why.


