bug-teseq
Re: [Bug-teseq] codesets are not shown


From: Micah Cowan
Subject: Re: [Bug-teseq] codesets are not shown
Date: Tue, 05 Aug 2008 16:50:53 -0700
User-agent: Thunderbird 2.0.0.16 (X11/20080724)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bruno Haible wrote:
> Hi Micah,
> 
> Nice tool you created! It even recognizes the xterm color-choice sequences.

Thanks Bruno! :)

> Find attached a few snippets from ISO-2022 based encodings. teseq does not
> show which codesets are chosen. Example:

<snip>

> It would be nice if it showed, as a comment/annotation, that
>   Esc $ ) A  switches G1 to GB2312
>   Esc $ ) G  switches G1 to CNS11643 plane 1
>   Esc $ + I  switches G3 to CNS11643 plane 3
>   Esc $ * H  switches G2 to CNS11643 plane 2
> etc.
> 
> You find the complete list of associations between escape sequences and
> codesets at the official ISO IR registry:
>   http://www.itscj.ipsj.or.jp/ISO-IR/
> especially
>   http://www.itscj.ipsj.or.jp/ISO-IR/2-9.htm
> in combination with the list at
>   http://www.itscj.ipsj.or.jp/ISO-IR/overview.htm

Yeah: I actually keep local copies of that around, as I refer to them
fairly regularly.

It was a conscious decision not to include descriptions for each
registered character set defined in the IR registry. Mainly motivated by
laziness ;)

The single-byte encodings have some coverage, but it is incomplete, and
the descriptions don't always tell similar sets apart. For example, the
final bytes 5/10 and 6/8 are both reported as designating "Spanish"
charsets, with nothing to distinguish one from the other.
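
For anyone not used to the column/row notation, here is a small sketch of
how a position such as 5/10 maps to a byte value. The function name
`final_byte` is purely illustrative and is not anything from Teseq itself.

```python
def final_byte(column: int, row: int) -> int:
    """Convert column/row notation (e.g. 5/10) to a byte value."""
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError("column and row must each be in 0..15")
    # The column is the high nibble and the row is the low nibble.
    return column * 16 + row


for col, row in [(5, 10), (6, 8)]:
    value = final_byte(col, row)
    print(f"{col}/{row} = 0x{value:02X} = {chr(value)!r}")
```

So the two "Spanish" final bytes are `Z` (0x5A) and `h` (0x68).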

I also had in mind that, in typical multibyte charset usage, the user
will already know whether the multibyte charset being designated is
Chinese, Japanese, Korean or whatever (though, of course, they won't know
which plane will be invoked unless they know the final bytes anyway).

So basically, I didn't particularly want to spend the time on it (even
though it really wouldn't take much time), at least not at the cost of
holding up the initial release. My main focus was getting something that
(a) met my personal needs, and (b) was reasonably polished. :) I'll
probably put it in at some point, though, and I'll be ecstatic to accept
any patches anyone might offer to provide these details.
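
To give a rough idea of the shape such a lookup could take, here is a
minimal sketch in Python (not Teseq's actual code, and not in its
implementation language). It covers only the four multibyte designations
quoted above. All names in it (`MULTIBYTE_94_SETS`,
`G_SET_BY_INTERMEDIATE`, `describe_designation`) are hypothetical, and a
real table would be filled in from the ISO-IR registry.

```python
ESC = 0x1B

# Final byte -> codeset name, limited to the examples quoted above.
MULTIBYTE_94_SETS = {
    "A": "GB2312",
    "G": "CNS11643 plane 1",
    "H": "CNS11643 plane 2",
    "I": "CNS11643 plane 3",
}

# Intermediate byte -> the G-set being designated.
G_SET_BY_INTERMEDIATE = {"(": "G0", ")": "G1", "*": "G2", "+": "G3"}


def describe_designation(seq: bytes):
    """Return an annotation for an 'ESC $ I F' sequence, or None if unknown."""
    if len(seq) != 4 or seq[0] != ESC or seq[1:2] != b"$":
        return None
    g_set = G_SET_BY_INTERMEDIATE.get(chr(seq[2]))
    name = MULTIBYTE_94_SETS.get(chr(seq[3]))
    if g_set is None or name is None:
        return None
    return f"switches {g_set} to {name}"


print(describe_designation(b"\x1b$)A"))
print(describe_designation(b"\x1b$+I"))
```

Returning `None` for anything not in the table means unregistered or
not-yet-listed final bytes would simply go without an annotation.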

What'd be potentially even cooler would be if Teseq remembered the
currently-invoked charsets, and used them in rendering the content of
the text lines (that's discussed in the Future Enhancements section of
the manual). Then it would be obvious what characters are being printed
to ECMA-48 / ISO 2022-capable terminals. It might even be slightly more
clarifying for ECMA-6 / ISO 646 character sets that happen not to be
US-ASCII. :)
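
The bookkeeping for that could be sketched roughly as follows. This is
only an illustration under stated assumptions, not a design for Teseq:
the class name `CharsetState` and its methods are made up, only the SO
and SI locking shifts are handled, and it assumes G0 starts out as
US-ASCII with nothing designated elsewhere.

```python
SO = 0x0E  # Shift Out: invoke G1 into GL
SI = 0x0F  # Shift In:  invoke G0 into GL


class CharsetState:
    """Track which charset is designated to each G-set and invoked in GL."""

    def __init__(self):
        # Assumption: G0 starts as US-ASCII, the rest are undesignated.
        self.designated = {"G0": "US-ASCII", "G1": None, "G2": None, "G3": None}
        self.gl = "G0"

    def designate(self, g_set: str, name: str) -> None:
        """Record that an escape sequence designated `name` to `g_set`."""
        if g_set not in self.designated:
            raise ValueError(f"unknown G-set: {g_set}")
        self.designated[g_set] = name

    def control(self, byte: int) -> None:
        """Apply a shift function; other control bytes are ignored."""
        if byte == SO:
            self.gl = "G1"
        elif byte == SI:
            self.gl = "G0"

    def current(self):
        """Return the charset currently invoked in GL (None if undesignated)."""
        return self.designated[self.gl]


state = CharsetState()
state.designate("G1", "GB2312")
state.control(SO)
print(state.current())
```

With something like `current()` available, the text lines could be
rendered using whatever charset is active at that point in the stream.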

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFImOdd7M8hyUobTrERAjpcAJ0RrBta9t9V1iXJNEhbNEBGOMgLjwCfUkv+
mm+Nruh378YHHzBn+/Q+WLQ=
=kr+T
-----END PGP SIGNATURE-----



