bug-gnu-libiconv

Re: [bug-gnu-libiconv] [PATCH] Add a few useful aliases


From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] [PATCH] Add a few useful aliases
Date: Tue, 8 Apr 2008 11:26:16 +0200
User-agent: KMail/1.5.4

Daniel Richard G. wrote:
> Do you honestly want your version of iconv(1) to be the odd man out that 
> requires such a workaround?

This is exactly the argument used in chain letters.

> If I were writing a program that identifies and returns an encoding, then 
> yes, the output should be in the canonical form. But here, you are saying 
> that you care more about making users change their way of working to the 
> tool, rather than the tool to the user. That the principle of "be lenient 
> in what you accept, and strict in what you produce" holds no water with 
> you.

Yes. Usually - when there are few implementations that need to deal with the
topic - I agree with "be lenient in what you accept". But here, there are
so many programs to deal with. If I said yes for libiconv, then you or
someone else would make the same request for Mozilla, then another one for
mutt, then another one for Samba, and so on, and 20 years from now programmers
would still be asked to add new aliases!

> If the key words 
> "multiple other implementations" aren't a good enough reason for GNU 
> libiconv-iconv(1) to recognize these aliases

They are not. When there are multiple implementations of a thing, and a
standard, then the standard should matter.

> Oh, if only that iconv(1) didn't buffer its entire input in memory, thereby 
> rendering it useless for large files....)

Eeek, you are right. This is weird. glibc iconv reads all input into memory
before processing it. The source code has this comment:

#ifdef _POSIX_MAPPED_FILES
            /* We have possibilities for reading the input file.  First try
               to mmap() it since this will provide the fastest solution.  */

but it is not even the fastest:

With iconv from libiconv, which reads the file piecemeal:
$ time iconv -f ISO-8859-1 -t UCS-2 < some-100mb-file > /dev/null 

real    1m5.545s
user    0m52.919s
sys     0m6.642s

With glibc iconv:
$ time /usr/bin/iconv -f ISO-8859-1 -t UCS-2 < some-100mb-file > /dev/null 

real    1m11.679s
user    0m8.046s
sys     0m2.472s

And look at 'top' while it's processing...
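
For illustration, here is a minimal sketch of what "reading the file piecemeal"
can look like with the iconv(3) API. It is not the actual code of either
iconv(1) program; the function name convert_stream and the buffer sizes are
made up for this example. The input is read in fixed-size chunks, so memory
use stays constant regardless of the file size, and an incomplete multibyte
sequence at the end of a chunk is carried over to the next read.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not taken from libiconv or glibc): convert the
   stream IN from encoding FROM to encoding TO, writing to OUT, using
   fixed-size buffers.  Returns 0 on success, -1 on any error.  */
int convert_stream(FILE *in, FILE *out, const char *from, const char *to)
{
    char inbuf[4096];     /* arbitrary chunk size, chosen for the example */
    char outbuf[16384];
    size_t pending = 0;   /* bytes of an incomplete sequence carried over */
    int status = 0;

    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    for (;;) {
        size_t n = fread(inbuf + pending, 1, sizeof inbuf - pending, in);
        if (n == 0) {
            /* Read error, or input that ends inside a multibyte sequence. */
            if (ferror(in) || pending > 0)
                status = -1;
            break;
        }

        char *src = inbuf;
        size_t srcleft = pending + n;
        while (srcleft > 0) {
            char *dst = outbuf;
            size_t dstleft = sizeof outbuf;
            size_t rc = iconv(cd, &src, &srcleft, &dst, &dstleft);
            int err = errno;   /* save before fwrite can change it */
            size_t produced = sizeof outbuf - dstleft;

            if (fwrite(outbuf, 1, produced, out) != produced) {
                status = -1;
                goto done;
            }
            if (rc == (size_t)-1) {
                if (err == E2BIG)
                    continue;  /* output buffer was full; it is flushed now */
                if (err == EINVAL)
                    break;     /* incomplete sequence at the end of the chunk */
                status = -1;   /* EILSEQ or another failure */
                goto done;
            }
        }
        /* Move the unconverted tail to the front for the next read. */
        memmove(inbuf, src, srcleft);
        pending = srcleft;
    }

    if (status == 0) {
        /* Emit any final shift sequence of a stateful target encoding. */
        char *dst = outbuf;
        size_t dstleft = sizeof outbuf;
        if (iconv(cd, NULL, NULL, &dst, &dstleft) == (size_t)-1
            || fwrite(outbuf, 1, sizeof outbuf - dstleft, out)
               != sizeof outbuf - dstleft)
            status = -1;
    }

done:
    iconv_close(cd);
    return status;
}
```

A caller could use it as convert_stream(stdin, stdout, "ISO-8859-1", "UCS-2").
Whether a loop like this is faster than mmap() depends on the system, but its
memory footprint does not grow with the input.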

Can you please report it in the glibc bug tracker
http://sourceware.org/bugzilla/ ?

Bruno




