[Gnu-arch-users] Increasing the filename space (Or: begging for trouble?)
From: Martin Thorsen Ranang
Subject: [Gnu-arch-users] Increasing the filename space (Or: begging for trouble?)
Date: 03 Feb 2004 03:15:43 +0100
Hi. :-)
After several hours of trying to add the Norwegian letters
LATIN SMALL LETTER AE: 'æ'
LATIN SMALL LETTER O WITH STROKE: 'ø'
LATIN SMALL LETTER A WITH RING ABOVE: 'å'
and (capitals)
LATIN CAPITAL LETTER AE: 'Æ'
LATIN CAPITAL LETTER O WITH STROKE: 'Ø'
LATIN CAPITAL LETTER A WITH RING ABOVE: 'Å'
to the set of accepted filenames, by changing the =tagging-method file
so that the source regexp reads
    source ^[_=a-zA-ZæøåÆØÅ0-9].*$
I have some thoughts I would like to share.
I've been thinking about the purpose of =tagging-method, and it seems
to me (based on, among other sources, the tutorial and the reference
manual) that the exclude/junk/precious/backup/unrecognized/source
regexps provide a means to define those categories, both explicitly
and implicitly.
I've also studied some of the code in tla that handles files, and it
seems that at least portions of hackerlab are very Unicode-aware,
while the module char/char-class.[hc] is strictly ASCII-based and
explicitly states that it will not consider any locale settings.
Now, my problem seems to be located in the function
contains_illegal_character (char *filename) in the file
tla/libarch/invent.c. Here I've included my suggested modification of
that function:
static int
contains_illegal_character (char * filename)
{
  int x;

  for (x = 0; filename[x]; ++x)
    if ((filename[x] == '*')
        || (filename[x] == '?')
        || (filename[x] == '[')
        || (filename[x] == ']')
        || (filename[x] == '\\')
        || (filename[x] == ' ')
        || (filename[x] == '\t')
        /* Suggested removal:
        || (!char_is_printable (filename[x]))
        */
        )
      return 1;

  return 0;
}
Now, I can see that the author of that function doesn't want any
"non-printable" characters in the inventory. But, based on
1) Tom Lord's statement that "the file system is, after all, a form of
database. File names are a primary key for that database.
Limiting the space of reasonably usable keys is lame."
... with which I agree, and
2) The filename would (probably) not even have been there in the first
place if it wasn't because somebody actually needed it or at least
could see it (i.e. it's printable).
and
3) If my assumption about the intention of the =tagging-method regexps
is right, then _those_ regexps should be the controlling variables,
_not_ a statically compiled and very restrictive set of characters.
I wonder: could you accept the modification suggested above? I
suppose that if you don't, the thing to do would be to add Unicode or
locale-aware filename handling, but this could certainly help a lot in
the meantime.
Yours sincerely,
Martin Thorsen Ranang