lynx-dev
Re: lynx-dev Re: lynx should respect LANG


From: Hataguchi Takeshi
Subject: Re: lynx-dev Re: lynx should respect LANG
Date: Wed, 7 Jun 2000 22:46:39 +0900 (JST)

On Sun, 4 Jun 2000, Klaus Weide wrote:

> See the shownonascii script that comes with metamail, which tries to spawn
> an xterm with the required font if necessary.  You should also find some
> example mailcap entries that use it.  Of course this works only under X,
> and it has (in my version, at least) 'xterm' hardwired in while you probably
> want something else like 'kterm' for Japanese.  But the script could be
> customized and extended.

I looked at the shownonascii script and finally understood that you're
right!  There seem to be two reasons for my mistake.  One is that I
referred only to metamail's man page, and the other is that I considered
only Japanese encodings.

> Is it true that all terminal emulators that understand euc-jp also
> understand iso-2022-jp directly (without the user having to switch modes)?
> I checked kterm, kon, and krxvt, they all seem to, at least for JISX0208
> characters.

The answer is no.

I tested some terminal emulators by catting a file which contains the
string "^[$B$[$2^[(B" encoded in iso-2022-jp.

    Tera Term Pro 2.3 on Windows98-J
        Kanji receive/transmit = EUC or SJIS
            I saw the string correctly.

    TELNET.EXE, which is bundled with Windows, on Windows98-J
        Kanji Codeset = EUC or SJIS
            I saw an unexpected string like "B$[$2B"
            # ESC and the next character seem to go away!

        Kanji Codeset = JIS
            I saw the string correctly (as expected)

    BOWPAD, which is a terminal emulator included in BSD on Windows, 
    on Windows98-J
        Kanji Code = EUC or SJIS
            I saw an unexpected string like "B$[$2B"

        Kanji Codeset = JIS
            I saw the string correctly (as expected)

    HTERM on MS-DOS
        Kanji Code = EUC, SJIS or JIS
            I saw the string correctly.
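
For anyone who wants to reproduce the test, here is a minimal sketch in
Python.  It is only an illustration: it relies on Python's bundled
iso2022_jp, euc_jp and shift_jis codecs, and the helper names reencode and
drop_escapes are hypothetical.  drop_escapes is just my guess at a model of
what the failing emulators appear to do (swallowing ESC and the byte after
it); I don't know how they actually work internally.

```python
import re

# The test string from this message: ESC $ B  $[ $2  ESC ( B
ISO2022JP_SAMPLE = b"\x1b$B$[$2\x1b(B"


def reencode(data: bytes, target: str) -> bytes:
    """Decode iso-2022-jp bytes and re-encode them in another charset."""
    return data.decode("iso2022_jp").encode(target)


def drop_escapes(data: bytes) -> bytes:
    """Hypothetical failure model: remove each ESC and the byte after it."""
    return re.sub(rb"\x1b.", b"", data, flags=re.DOTALL)


if __name__ == "__main__":
    # The same two characters as they would be sent in EUC and SJIS.
    print(reencode(ISO2022JP_SAMPLE, "euc_jp").hex())
    print(reencode(ISO2022JP_SAMPLE, "shift_jis").hex())
    # What is left when the escape sequences are not interpreted.
    print(drop_escapes(ISO2022JP_SAMPLE).decode("ascii"))
```

Under this model the leftover text is the same "B$[$2B" that TELNET.EXE
and BOWPAD displayed in EUC or SJIS mode.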

> Also, just out of curiosity, is kterm what you normally use?

I usually use kterm and the three emulators above other than TELNET.EXE.

> > iso-2022-jp shouldn't be a synonym for euc-jp. So Lynx should ignore
> > iso-2022-jp in this case.
> 
> It used to be, for all practical purposes that I can see, before your
> changes that are now in 2.8.3.  You have effectively disabled recognition
> of "charset=ISO-2022-JP" and "charset=ISO-2022-JP-2", which is not so
> good.  

I agree that disabling recognition of those two charsets isn't so good.
I just thought that treating them as synonyms for euc-jp was worse.

> Lynx *should* recognize documents with such an explicit charset
> as Japanese.  I think the previous behavior, although not the most
> correct, was better; I have changed this in my code.

Of course I agree with the first sentence.

> > > But normally, I imagine you probably just wouldn't set MM_CHARSET as a
> > > Japanese user.
> > 
> > Why?  I want to recommend that Japanese metamail users set MM_CHARSET
> > to iso-2022-jp so that metamail doesn't show warnings.
> 
> Okay, I didn't consider that properly.  So using MM_CHARSET in the way
> I proposed may not be such a good idea.  At least for Japanese...
> I still think it makes sense for everything but Japanese charsets.

I agree.

> > But please note when we open a Japanese file with mule (Multilingual
> > Enhancement to GNU Emacs), we don't have to tell the charset to mule
> > [...]
> > I can say almost the same thing about nkf (Network Kanji Filter).  I
> > think the detection routine of Lynx for Japanese may be as good as
> > those of mule and nkf.
> 
> There are still situations where autodetection fails, and where it helps
> to tell the program what the input is.  I think you agree that that's
> not just theoretical, otherwise you wouldn't have added the changes
> for -assume_charset handling.

You're right.  But I'm not sure which is more likely: "autodetection
fails" or "LANG=euc-jp and a novice user has SJIS files".
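
To make the "autodetection fails" case concrete, here is a small sketch in
Python of one way such ambiguity can arise.  It is not Lynx's detection
routine (nor mule's or nkf's); it only asks Python's bundled codecs which
charsets accept a byte string, and the names decodes_as, candidate_charsets
and AMBIGUOUS are hypothetical.  The sample bytes are two half-width
katakana in SJIS, which happen to form a valid two-byte kanji code in EUC.

```python
def decodes_as(data: bytes, charset: str) -> bool:
    """Return True if data is a valid byte sequence in the given charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


def candidate_charsets(data: bytes,
                       charsets=("iso2022_jp", "euc_jp", "shift_jis")):
    """List every charset under which data decodes without error."""
    return [name for name in charsets if decodes_as(data, name)]


# 0xB1 0xB2: two half-width katakana in SJIS, one kanji in EUC.
AMBIGUOUS = b"\xb1\xb2"

if __name__ == "__main__":
    for name in candidate_charsets(AMBIGUOUS):
        print(name, AMBIGUOUS.decode(name))
```

When more than one charset accepts the input like this, validity alone
cannot decide, so a hint from the user still helps.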

> Do you, and other Japanese users, actually toggle '@' when you visit
> non-Japanese (non-CJK) sites?  Do you know you're supposed to?
> Or do you just not care enough about anything but Japanese and 7-bit
> ASCII characters?

I hadn't cared enough about it.  Today I tried test/iso-8859-1.html while
toggling '@' and saw the difference.
--
Takeshi Hataguchi
E-mail: address@hidden
