Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects
From: Ken Hornstein
Subject: Re: [Nmh-workers] Non-ASCII Characters in bodies and subjects
Date: Wed, 18 Jun 2014 09:22:28 -0400
>> That's not universally true anymore. Some newer filesystems are
>> mandating that filenames are UTF-8 and enforcing normalization rules
>> (MacOS X and Solaris are two notable examples).
>
>Thanks, I didn't know. Haven't used Solaris in years, and never bought
>Apple.
Let me amend this a bit; as I understand it, you have to enable that
behavior on Solaris. It's the default behavior on MacOS X.
>> Solaris is better; the original bytes are preserved, but lookup is
>> done using normalized names so you can't have two filenames with the
>> same characters.
>
>What about globbing, especially on Mac OS X? Given your two examples on
>Linux with bash,
>[...]
So, clearly we need some userspace support. AFAIK, the globbing isn't
Unicode-aware; it's just matching on whatever readdir() returns. Should
a ? match on a byte? A Unicode codepoint? An abstract character? I am
not sure, and I am not sure if anyone has decided on this from a standards
point of view.
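
To make the question concrete, here is a small Python sketch (not anything
from nmh or a shell) of how the answer changes what "?" matches. It uses
the standard fnmatch module on str for code point matching and on bytes
for byte matching, and it assumes normalizing to NFC first as a rough
stand-in for "abstract character" matching. The name "café" is just an
illustrative example.

```python
import fnmatch
import unicodedata

# The same visible name in two forms: precomposed (NFC) and
# "e" followed by a combining acute accent (NFD).
nfc_name = unicodedata.normalize("NFC", "caf\u00e9")
nfd_name = unicodedata.normalize("NFD", "caf\u00e9")

# Code point matching: on str, "?" consumes exactly one code point.
codepoint_nfc = fnmatch.fnmatchcase(nfc_name, "caf?")   # U+00E9 is one code point
codepoint_nfd = fnmatch.fnmatchcase(nfd_name, "caf?")   # e + U+0301 is two

# Byte matching: on the UTF-8 bytes, "?" consumes exactly one byte.
nfc_bytes = nfc_name.encode("utf-8")                    # c a f + 2 bytes
nfd_bytes = nfd_name.encode("utf-8")                    # c a f e + 2 bytes
byte_nfc_one = fnmatch.fnmatchcase(nfc_bytes, b"caf?")
byte_nfc_two = fnmatch.fnmatchcase(nfc_bytes, b"caf??")
byte_nfd_three = fnmatch.fnmatchcase(nfd_bytes, b"caf???")

# Approximating "abstract character" matching (an assumption of this
# sketch): normalize the name to NFC before matching on code points.
normalized_nfd = fnmatch.fnmatchcase(
    unicodedata.normalize("NFC", nfd_name), "caf?")

print(codepoint_nfc, codepoint_nfd)
print(byte_nfc_one, byte_nfc_two, byte_nfd_three)
print(normalized_nfd)
```

Normalizing before matching only helps for sequences that have a
precomposed form; a full answer would need grapheme cluster
segmentation, which the Python standard library does not provide.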
>Do you think NFKC would be better, so ? often matches what appears as a
>single rune and fi matches ligature ﬁ?
Hm. I believe some network filesystems use NFKC, but I am neutral on
what should be done. Should fi match ﬁ? I cannot decide; I see
arguments for both.
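
For reference, a short Python sketch of what the choice means in
practice, using the standard unicodedata module. The helper same_name()
and the file names are hypothetical, made up for illustration; this is
not how any particular filesystem implements lookup. Under NFC the
two-letter sequence "fi" and the single ligature character U+FB01 stay
distinct, while NFKC folds the ligature into the two plain letters.

```python
import unicodedata

LIGATURE_FI = "\ufb01"  # U+FB01 LATIN SMALL LIGATURE FI

def same_name(a, b, form):
    """Compare two names after normalizing both to the given form.

    Hypothetical helper: it only models "lookup uses normalized names"
    and says nothing about how the original bytes are stored.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

plain = "file.txt"
ligature = LIGATURE_FI + "le.txt"

# Canonical normalization keeps the ligature as its own character.
nfc_equal = same_name(plain, ligature, "NFC")

# Compatibility normalization replaces the ligature with "f" + "i".
nfkc_equal = same_name(plain, ligature, "NFKC")

# Both forms agree on canonically equivalent accents.
composed = "caf\u00e9"      # precomposed e-acute
decomposed = "cafe\u0301"   # e + combining acute accent
accent_equal_nfc = same_name(composed, decomposed, "NFC")
accent_equal_nfkc = same_name(composed, decomposed, "NFKC")

print(nfc_equal, nfkc_equal, accent_equal_nfc, accent_equal_nfkc)
```

So with NFKC lookup the two names above would collide, and with NFC
they would be two separate files.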
--Ken