Re: Multibyte and unibyte file names
From: Michael Albinus
Subject: Re: Multibyte and unibyte file names
Date: Wed, 23 Jan 2013 20:42:55 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)
Eli Zaretskii <address@hidden> writes:
> 2) This gets worse with remote file names. For these, the handlers
> are always called first, and the result is never run through
> dostounix_filename. However, Tramp sometimes turns around and
> calls the "real" handler on parts of the remote file name,
> evidently expecting that "real" handler not to do any harm. But
> due to the above, it does do harm. While it might be justified to
> limit native file name support to file names encodable with the
> current file-name-coding-system, it _cannot_ be justified for
> remote file names. An example of this is file-name-directory:
>
> (defun tramp-handle-file-name-directory (file)
>   "Like `file-name-directory' but aware of Tramp files."
>   ;; Everything except the last filename thing is the directory. We
>   ;; cannot apply `with-parsed-tramp-file-name', because this expands
>   ;; the remote file name parts. This is a problem when we are in
>   ;; file name completion.
>   (let ((v (tramp-dissect-file-name file t)))
>     ;; Run the command on the localname portion only.
>     (tramp-make-tramp-file-name
>      (tramp-file-name-method v)
>      (tramp-file-name-user v)
>      (tramp-file-name-host v)
>      (tramp-run-real-handler
>       'file-name-directory (list (or (tramp-file-name-localname v) ""))))))
>
> which on Windows means that, e.g.
>
> (let ((file-name-coding-system 'cp1252))
>   (file-name-directory "/address@hidden:漢字/"))
>
> => "/address@hidden: /"
>
> And there are other similar handlers in Tramp (e.g., the
> file-name-nondirectory handler) which do the same. IOW, they seem
> to _assume_ that the corresponding "real" handler never needs to
> encode the file name. A false assumption.
Tramp is not prepared to handle encoded file names. One of its first
actions on the remote side is to set "LC_ALL=C" in the environment.
Android devices are the exception; they require UTF-8.
I agree, Tramp should check carefully which encoding a file name uses.
This must be added to the code.
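
As a rough illustration of what such a check could look like (a sketch
only, not existing Tramp code): the helper name
`sketch-file-name-encodable-p' is hypothetical, and it assumes that
asking `unencodable-char-position' about the localname is a good enough
test before the name is handed to a "real" handler.

```elisp
;; Hypothetical helper, NOT part of Tramp; the name is made up for
;; illustration.
(defun sketch-file-name-encodable-p (localname &optional coding-system)
  "Return non-nil if LOCALNAME can be encoded in CODING-SYSTEM.
CODING-SYSTEM defaults to `file-name-coding-system', then to
`default-file-name-coding-system'.  If neither is set, no encoding
takes place and the result is t."
  (let ((cs (or coding-system
                file-name-coding-system
                default-file-name-coding-system)))
    (or (null cs)
        ;; `unencodable-char-position' returns nil when every
        ;; character of the string can be encoded by CS, otherwise
        ;; the index of the first character that cannot.
        (not (unencodable-char-position
              0 (length localname) cs nil localname)))))

;; Possible use: test the localname part before passing it on.
(sketch-file-name-encodable-p "/tmp/" 'cp1252)
(sketch-file-name-encodable-p "漢字/" 'cp1252)
```

With cp1252, the first call is expected to return non-nil and the
second nil, so a handler could refuse the second name, or treat it
specially, instead of passing it to the native primitive.
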
There might be a chance to switch to en_US.UTF-8 on the remote side. But
even then I would propose to start with the unibyte subset. It has to be
"en_US" because Tramp parses the output of commands, which must not be
localized.
Encodings other than UTF-8 will be hard to support. Not only does Tramp
call "native" file name primitives; there are also several routines that
parse the output of commands on the remote side, and they have their own
expectations about file name syntax and encoding.
> TIA
Best regards, Michael.