lynx-dev

Re: lynx-dev reading sjis docs [was Re: lynxcgi problem]


From: Henry Nelson
Subject: Re: lynx-dev reading sjis docs [was Re: lynxcgi problem]
Date: Tue, 28 Dec 1999 22:25:01 +0900 (JST)

> By the way, I'm wondering ASSUME_CHARSET doesn't work for Japanese
> as expected now as you've ever wrote.
> Do you know the relationship between ASSUME_CHARSET and 
> "kanji code", which can be changed by ^L with SH_EX?

ASSUME_CHARSET is turned off for CJK, as far as I know.  Our LAN service
is very unstable right now, so I cannot try to search the archives for you,
but look in the "http://www.flora.org/lynx-dev/html/month1097" archives,
and grep for "did something happen to."  Or obtain the whole month's archive
from me: "http://www.irm.nara.kindai.ac.jp/lynxdev/archives/9710.arc.gz".
Don't bother reading my posts; only read those by Klaus Weide.  You may find
a few hints.  Klaus and Leonid Pauzner are probably the only two people
besides Hiroyuki Senshu who could help you in this area.

My *hunch* is that ASSUME_CHARSET would not offer much to help Lynx render
Japanese documents.  How can you assume?

But rather than waste your time because of my ignorance, I have put four
examples of a "problem page" on my server.  Each contains a mixture of
three encodings; one has no meta tag, and the other three have meta tags as
indicated.  They have the same content except for the meta.
        http://163.51.110.11/lynxdev/docs/mix-nometa.html
        http://163.51.110.11/lynxdev/docs/mix-eucmeta.html
        http://163.51.110.11/lynxdev/docs/mix-isometa.html
        http://163.51.110.11/lynxdev/docs/mix-sjismeta.html

Lynx cannot render this page after the third line of the "Longer text"
under "* SJIS."  "nkf -Se" processing improves the rendering, but of course
it is not perfect (the EUC text is lost).
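
To illustrate why a single SJIS-to-EUC pass cannot rescue a mixed-encoding page, here is a minimal Python sketch. It is not how nkf or Lynx work internally; it only mimics the effect of "nkf -Se" (treat all input as Shift_JIS, emit EUC-JP) using Python's standard codecs. The sample string and the helper name `sjis_to_euc` are assumptions made for the illustration and are not taken from the test pages.

```python
# Illustrative only: mimic the effect of "nkf -Se" with Python's codecs.
# The sample text is hypothetical, not content from the test pages.
sample = "\u65e5\u672c\u8a9e"  # three kanji, written as escapes to stay ASCII

sjis_bytes = sample.encode("shift_jis")
euc_bytes = sample.encode("euc_jp")


def sjis_to_euc(data: bytes) -> bytes:
    # Treat every input byte as Shift_JIS and re-encode the result as EUC-JP.
    # Undecodable or unencodable sequences are replaced instead of raising.
    text = data.decode("shift_jis", errors="replace")
    return text.encode("euc_jp", errors="replace")


# Text that really was SJIS converts cleanly to EUC-JP.
print(sjis_to_euc(sjis_bytes) == euc_bytes)

# Text that was already EUC-JP is misread as SJIS, so it does not survive.
print(sjis_to_euc(euc_bytes) == euc_bytes)
```

The second comparison is the "lose EUC" case: bytes that were valid EUC-JP get reinterpreted as Shift_JIS, so the converted output no longer matches the original text.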

My question is why the last of those, the one with the "charset=x-sjis" meta,
is not rendered "properly" without having to use the SH_EX ^L => SJIS manual
switch (or the pseudo-proxy method I prefer).

> If this is right, I think ASSUME_CHARSET should work properly.
> # "Japanese (Auto Detect)" should be added in the list, if needed.
> Don't you agree with me, Henry?

Sorry, but I just don't know.  My "gut feeling" is that "Japanese (Auto
Detect)" should be the default unless Lynx can determine from the server
header or a meta definition what the character encoding is, and in that
case set the document character encoding to what it has determined.
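
The precedence described above (server header first, then a meta definition, otherwise auto-detect) can be sketched in Python as follows. This is only an illustration of the proposed policy, not Lynx's actual code; the function names, the `AUTO_DETECT` marker, and the simple regular expressions are all assumptions.

```python
import re
from typing import Optional

# Hypothetical marker standing in for "Japanese (Auto Detect)".
AUTO_DETECT = "auto"

# Deliberately simple patterns; real HTML parsing is more involved.
_CHARSET_RE = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE)
_META_RE = re.compile(rb"<meta[^>]*>", re.IGNORECASE)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    # Look for "charset=..." in a Content-Type header value, if one was sent.
    if not content_type:
        return None
    match = _CHARSET_RE.search(content_type.encode("ascii", "ignore"))
    return match.group(1).decode("ascii").lower() if match else None


def charset_from_meta(html: bytes) -> Optional[str]:
    # Scan only the start of the document for a <meta ...charset=...> tag.
    for tag in _META_RE.findall(html[:4096]):
        match = _CHARSET_RE.search(tag)
        if match:
            return match.group(1).decode("ascii").lower()
    return None


def choose_charset(content_type: Optional[str], html: bytes) -> str:
    # Header wins, then meta, and auto-detect is the fallback default.
    return charset_from_header(content_type) or charset_from_meta(html) or AUTO_DETECT
```

For example, `choose_charset("text/html", html)` on a page carrying a "charset=x-sjis" meta would return "x-sjis", while a page with neither a header charset nor a meta tag would fall back to the auto-detect marker.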

Another point I don't understand is how changing the document character
set from the form-based O)ption Menu is/should be different from the ^L
switch.

> Though I don't have a Win32 version which is enabled to use lynxcgi, 
> I tried it under Cygwin and found it works.
> I think it works also in the environment Cygwin isn't installed, 
> if you have sed and xargs.
> --
> Takeshi Hataguchi
> E-mail: address@hidden
> 
> @echo off
> echo Content-type: text/html
> echo\
> PATH=d:\home\patakuti\bin;c:\cygnus\cygwin-b20\H-i586-cygwin32\bin;c:\cygnus\cygwin-b20\H-i586-cygwin32\usr\local\bin;%PATH%
> echo %QUERY_STRING% | sed 's/\/sj//' | xargs lynx -source | nkf -Se

Excellent!  So it can be done on a PC-based Lynx.  This is perhaps the first
such batch file for this platform.  I hope others will be encouraged to
experiment.
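
For anyone who wants to experiment without sed, xargs, and nkf, the same pipeline can be sketched in Python. This is a rough equivalent under stated assumptions, not a tested replacement for the batch file: it assumes a "lynx" executable is on the PATH, uses Python's codecs in place of "nkf -Se", and the function names (`strip_sj_marker`, `fetch_with_lynx`, `sjis_to_euc`, `render`) are made up for the illustration.

```python
import subprocess
from typing import Callable


def strip_sj_marker(query: str) -> str:
    # Same idea as the sed step in the batch file: drop the first "/sj".
    return query.strip().replace("/sj", "", 1)


def fetch_with_lynx(url: str) -> bytes:
    # Assumes "lynx" is on PATH; "-source" dumps the raw document bytes.
    result = subprocess.run(["lynx", "-source", url],
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


def sjis_to_euc(data: bytes) -> bytes:
    # Stand-in for "nkf -Se": read as Shift_JIS, write as EUC-JP.
    text = data.decode("shift_jis", errors="replace")
    return text.encode("euc_jp", errors="replace")


def render(query: str, fetch: Callable[[str], bytes] = fetch_with_lynx) -> bytes:
    # Build the CGI response: header line, blank line, converted body.
    body = sjis_to_euc(fetch(strip_sj_marker(query)))
    return b"Content-type: text/html\n\n" + body
```

A CGI wrapper would pass the QUERY_STRING environment variable to `render` and write the returned bytes to standard output. The `fetch` parameter exists so the conversion can be tried with canned data before lynx is involved.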

__Henry

PS I will be in for half a day tomorrow, and then I will be gone for close
   to a month.
