Re: Why does dired go through extra efforts to avoid unibyte names
From: Eli Zaretskii
Subject: Re: Why does dired go through extra efforts to avoid unibyte names
Date: Fri, 29 Dec 2017 21:17:29 +0200

> From: Stefan Monnier <address@hidden>
> Date: Fri, 29 Dec 2017 09:34:53 -0500
> Cc: address@hidden
>
> I bumped into the following code in dired-get-filename:
>
>     ;; The above `read' will return a unibyte string if FILE
>     ;; contains eight-bit-control/graphic characters.
>     (if (and enable-multibyte-characters
>              (not (multibyte-string-p file)))
>         (setq file (string-to-multibyte file)))
>
> and I'm wondering why we don't want a unibyte string here.
> `vc-region-history` told me this comes from the commit appended below,
> which seems to indicate that we're worried about a subsequent encoding,
> but AFAIK unibyte file names are not (re)encoded, and passing them
> through string-to-multibyte would actually make things worse in this
> respect (since it might cause the kind of (re)encoding this is
> supposedly trying to avoid).
>
> What am I missing?

Why does it matter whether eight-bit-* characters are encoded one more
or one less time?

As for the reason for using string-to-multibyte: maybe it's because we
use concat further down in the function, which will determine whether
the result will be unibyte or multibyte according to its own ideas of
what's TRT?
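
A minimal sketch of the concat behavior alluded to above, not taken from
dired itself. The function name `my-concat-representation-demo' and the
sample strings are hypothetical, chosen only for illustration. It shows
that concat yields a unibyte result when all of its arguments are
unibyte, and a multibyte result as soon as one argument is multibyte:

```elisp
;; Illustration only: the function name and sample data are assumptions,
;; not code from dired.el.
(defun my-concat-representation-demo ()
  "Return a list saying whether three sample strings are multibyte."
  (let* ((raw   (unibyte-string #xC0 #xC1))   ; unibyte, two non-ASCII bytes
         (dir   (string ?/ #xE9 ?/))          ; multibyte: contains U+00E9
         (uni   (concat "/tmp/" raw))         ; ASCII literal + unibyte
         (multi (concat dir raw)))            ; multibyte + unibyte
    (list (multibyte-string-p raw)
          (multibyte-string-p uni)
          (multibyte-string-p multi))))

(my-concat-representation-demo)
;; => (nil nil t)
```

So the representation of a file name built with concat depends on the
other pieces it is joined with, which may be why the code makes the
string multibyte explicitly beforehand.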