bug-gnu-libiconv

Re: [bug-gnu-libiconv] [PATCH] Add a few useful aliases


From: Daniel Richard G.
Subject: Re: [bug-gnu-libiconv] [PATCH] Add a few useful aliases
Date: Mon, 7 Apr 2008 21:02:41 -0400

On Tue, 2008 Apr 08 00:54:39 +0200, Bruno Haible wrote:
> > I am concerned here with iconv-the-program, not iconv-the-function-call. Is 
> > there a way to add aliases that are visible only in the former case?
> 
> You can write a wrapper script around the iconv program that supports the
> options that you want.

Do you honestly want your version of iconv(1) to be the odd man out that 
requires such a workaround?
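
(For reference, the kind of wrapper being suggested would look roughly like
the sketch below. The names normalize_charset and iconv_wrap, the ICONV
override variable, and the one-entry alias table are all made up for
illustration; only the separate "-f NAME" / "-t NAME" argument forms are
handled.)

```shell
#!/bin/sh
# Hypothetical wrapper sketch: map hyphen-less aliases to names that the
# underlying iconv(1) accepts, then hand off to the real program.

# normalize_charset NAME: print the name to pass on to iconv.
# The alias table is illustrative only; extend it as needed.
normalize_charset() {
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        utf8) printf '%s\n' UTF-8 ;;
        *)    printf '%s\n' "$1" ;;   # anything else passes through unchanged
    esac
}

# iconv_wrap ARGS...: rewrite the value following -f or -t and leave every
# other argument alone. Arguments are rotated from the front of the list to
# the back, so quoting and spaces are preserved.
iconv_wrap() {
    n=$#
    while [ "$n" -gt 0 ]; do
        arg=$1; shift; n=$((n - 1))
        case $arg in
            -f|-t)
                if [ "$n" -gt 0 ]; then
                    val=$1; shift; n=$((n - 1))
                    set -- "$@" "$arg" "$(normalize_charset "$val")"
                else
                    set -- "$@" "$arg"
                fi ;;
            *)  set -- "$@" "$arg" ;;
        esac
    done
    # ICONV is an assumed override for the real binary; defaults to iconv.
    "${ICONV:-iconv}" "$@"
}

# Saved as a script, the last line would be:  iconv_wrap "$@"
```

Every user who wants "utf8" to work would have to install and maintain
something like this, which is exactly the burden in question.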

> > for the command-line tool not to recognize e.g. "utf8" when other 
> > well-established implementations do shows a rather troubling emphasis 
> > of pedantry over interoperability.
> 
> If you are after interoperability, you should look at the standards. In the
> case of character set names, it's the IANA list.

Again, you are putting pedantry over practical interoperability concerns. I 
don't want a lesson in proper encoding-name syntax; I just want this thing 
to *work*.

If I were writing a program that identifies and returns an encoding, then 
yes, the output should be in the canonical form. But here, you are saying 
that you would rather make users adapt their way of working to the tool 
than adapt the tool to its users, and that the principle of "be lenient 
in what you accept, and strict in what you produce" holds no water with 
you. Your argument might even have some merit if GNU libiconv were 
normative, and there were not *multiple other implementations* that already 
follow the more flexible approach. You're tilting at windmills, and causing 
headaches for people who expect things to *just work*.

(Which is made all the more egregious by the fact that one of those "other 
implementations" is the glibc version---also under the GNU umbrella! Oh, if 
only that iconv(1) didn't buffer its entire input in memory, thereby 
rendering it useless for large files....)

> People who write "utf8" usually don't do this because they care for
> interoperability. They do it out of sloppiness, or because they don't know
> about the standards.

Or because they see that iconv(1) recognizes it, and it is shorter than 
"utf-8" without losing specificity.

>   - If you allow sloppiness here, where do you stop? People have trouble
>     memorizing "8859", so should iconv also support "iso-8559-1"? Do you do
>     fuzzy matching?

Is the synonym recognized by multiple third-party implementations of 
iconv(1)?

>   - End users should not need to know about the standards, but programmers
>     should. If you have a GUI (menu / combobox) for example, show the user
>     a combination of standard name and explanation, like:
> 
>       ISO-8859-15 (Western with EURO)
>       UTF-8 (Unicode)

This is the programmatic case. I agree with you, but it doesn't apply here.

> > My employer makes use of files named e.g. "en.utf8.dic.txt", 
> > "en.iso8859-1.dic.txt".
> 
> He could also have chosen to prefer even smaller file names, e.g.
> "en.u8.dic.txt" and "en.l1.dic.txt". That would not be a good reason for
> asking that iconv must support the names "u8" and "l1".

The point is to illustrate a case where dropping the hyphen has a 
beneficial result, not to argue for supporting an arbitrary idiom. The 
encoding names were chosen both for their terseness and for their 
compatibility with other implementations of iconv(1).

> > The toolchain I referred to above uses GNU Make, and with pattern rules,
> 
> Dashes are supported in pattern rules. This works without problems:
> 
> %.utf-8.txt : %.iso-8859-1.txt
>         iconv -f iso-8859-1 -t utf-8 < $< > $@

The extra hyphens are superfluous. Why should they be necessary, when other 
iconv implementations work fine without them? Heck, even "iso8859-1" is 
recognized as it is!


Anyway. I'm not going to argue the point further. If the key words 
"multiple other implementations" aren't a good enough reason for the 
iconv(1) in GNU libiconv to recognize these aliases, then I'm not going to 
make any headway here.


--Daniel


-- 
NAME   = Daniel Richard G.       ##  Remember, skunks       _\|/_  meef?
EMAIL1 = address@hidden        ##  don't smell bad---    (/o|o\) /
EMAIL2 = address@hidden      ##  it's the people who   < (^),>
WWW    = http://www.******.org/  ##  annoy them that do!    /   \
--
(****** = site not yet online)



