bug-gnu-emacs

bug#4157: 23.1.50; faulty character characterisation for ä


From: Peter Dyballa
Subject: bug#4157: 23.1.50; faulty character characterisation for ä
Date: Sun, 23 Aug 2009 11:57:37 +0200


On 23.08.2009 at 03:49, Stefan Monnier wrote:

In both locales the *file names* are correct and also detected as containing

"correct" doesn't really tell me what you see, but I see what you mean.

"Correct" meant that I was seeing what I had typed before in Finder...


"composed characters," it's a problem with the file's month date. In the

So my guess was right: ls's output uses utf-8 for the filenames, but
latin-1 for the date, which is why it's difficult for dired to do the
right thing (it's not impossible, of course, but it's more work and
dired is currently not set up for that).


Here is a little test from a shell (actually a *shell* buffer in NS Emacs.app with UTF-8 locales):

pete 252 /\ gls -lN zo*
-rw-r--r-- 1 pete admin 281829 20. Mär 1998  zoä€.au
pete 253 /\ ls -lw zo*
-rw-r--r--   1 pete  admin  281829 20 Mär  1998 zoä€.au
pete 254 /\ gls -lN zo* | od -j 32 -t a
0000040 0 . sp M \303 \244 r sp 1 9 9 8 sp sp z o
0000060    a   \314  88   \342  82   \254   .   a   u  nl
0000072
pete 255 /\ env LC_CTYPE=de_DE.ISO8859-15 LANG=de_DE.ISO8859-15 gls -lN zo* | od -j 32 -t a
0000040 0 . sp M \344 r sp 1 9 9 8 sp sp z o a
0000060    \314  88   \342  82   \254   .   a   u  nl
0000071
pete 256 /\ ls -lw zo* | od -j 32 -t a
0000040 2 9 sp 2 0 sp M \303 \244 r sp sp 1 9 9 8
0000060   sp   z   o   a   \314  88   \342  82   \254   .   a   u  nl
0000075
pete 257 /\ env LC_CTYPE=de_DE.ISO8859-15 LANG=de_DE.ISO8859-15 ls -lw zo* | od -j 32 -t a
0000040 2 9 sp 2 0 sp M \344 r sp sp 1 9 9 8 sp
0000060    z   o   a   \314  88   \342  82   \254   .   a   u  nl
0000074

So both ls commands (gls and ls) deliver the month name of the date composed and in their locale's encoding, while the file name is always *de*composed UTF-8:

\303 \244    = C3 A4    = LATIN SMALL LETTER A WITH DIAERESIS ä at U+00E4
\314 88      = CC 88    = COMBINING DIAERESIS ¨ at U+0308
\342 82 \254 = E2 82 AC = EURO SIGN € at U+20AC
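
For anyone who wants to check this without reading od dumps, here is a minimal Python sketch. The byte values are copied from the ISO8859-15 run of gls above; the variable names are only illustrative, and this is not how dired handles it. It decodes the month with the locale's encoding and the file name as UTF-8, then shows that NFC normalization turns the decomposed name into the composed one.

```python
import unicodedata

# Bytes copied from the od dump of the ISO8859-15 run of gls.
month_latin9 = b"M\xe4r"                    # month name, locale encoding
name_utf8 = b"zoa\xcc\x88\xe2\x82\xac.au"   # file name, decomposed UTF-8

# Each part needs its own decoder; one codec for the whole line would not work.
month = month_latin9.decode("iso8859-15")
name = name_utf8.decode("utf-8")

# List the code points of the file name: 'a' and U+0308 appear separately.
for ch in name:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# NFC composes 'a' + COMBINING DIAERESIS into the single code point U+00E4.
composed = unicodedata.normalize("NFC", name)
print(len(name), len(composed))
```

The decomposed name has one code point more than the composed one, which is the difference between what the file system returns and what was typed.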

--
Greetings

  Pete

Bake pizza not war!







