lynx-dev

Re: LYNX-DEV minor display problem (?character 0xA2?)


From: Klaus Weide
Subject: Re: LYNX-DEV minor display problem (?character 0xA2?)
Date: Fri, 2 May 1997 08:13:11 -0500 (CDT)

On Fri, 2 May 1997, Bela Lubkin wrote:

> http://www.demos.su/ -- on an 80-column display, the first line of the
> display contains four links.  (I can't read the link names due to
> character set (and language :-) problems, but that's irrelevant.)
>[...] 
>   | link1-win | link2-koi | link3-dos | lilink4-iso
>                                           ^^^^^^^^^
> This is on SCO OpenServer Release 5.0.0, various versions of Lynx all
> built with SCO cc and SCO libcurses.  It occurs when using terminal
> types "vt102", "ansi", or "ansi_ncc" (SCO ansi with no color); it does
> not occur when using "wy60".  Lynx set up for IBM PC character set, raw
                                                ^^^^^^^^^^^^^^^^^^^^
> mode off.  Fooling with the character set changes the behavior in
> non-obvious ways.
> 
> The problem did not occur with these older binaries: 2.5,
> 2.5FM-96-07-19, 2.6, 2.6RP.  It does occur with these binaries: 2.7-PRE,
> 2.7ac-0.31, 2.7.1ac-0.19.

> ...
> An hour later: it's because character 0xA2 is eventually being
> translated to 0x9b on output.  The SCO ANSI console takes 0x9b as CSI,
                                     ^^^^^^^^^^^^^^^^
> Control Sequence Initiator, same as ESC [.  This is pretty standard
> behavior for ANSI terminals of various sorts.  

The IBM PC character set (cp437) contains a visible character at that
position.  Since you are telling Lynx that that is your display character
set, when apparently it is not, you shouldn't be surprised about unusual
results.
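
The 0xA2 -> 0x9b step can be reproduced outside Lynx.  The following is a
minimal sketch using Python's standard codecs, not Lynx's own translation
tables; the variable names are only for illustration.  It shows that the
iso-8859-1 cent sign at 0xA2 lands on 0x9b in cp437, and that the same byte
value is the C1 control CSI when read as iso-8859-1.

```python
# ISO-8859-1 byte 0xA2 is the cent sign (U+00A2).
latin1_byte = bytes([0xA2])
char = latin1_byte.decode("iso-8859-1")

# cp437 has a visible cent sign too, but at a different position.
cp437_byte = char.encode("cp437")
print(hex(cp437_byte[0]))  # 0x9b

# A terminal that does not really use cp437 sees 0x9b as a control code:
# in iso-8859-1 terms it is U+009B, the 8-bit CSI.
as_c1 = cp437_byte.decode("iso-8859-1")
print(hex(ord(as_c1)))  # 0x9b
```

So the translation itself is correct for a true cp437 display; the trouble
starts only when the terminal interprets the output differently.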

> Not sure what can be done
> about it.

The text of those links is the Russian for "Russian Version", repeated
5 times, each time in a different character set (code page).
They are there for the human reader, because the human reader will see
those links and a maximum of one of the 5 versions will make sense to
the reader (did I mention those character sets are all incompatible?)
and the reader will then follow that one link.  There's no way Lynx
can do anything like understand those texts and choose for you a
character set.  The page where this appears isn't labeled as anything
special character-set-wise, so it's assumed to be iso-8859-1.  So
are the 5 two-word texts.  Of course we know that's not true, but
how should Lynx know?  It does what it is told and translates
iso-8859-1 code points to cp437 code points, without any idea that there is
anything Cyrillic involved, that characters are not meant to be what the
page says they are, or that poor Lynx is not configured for the right
display character set anyway. 
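
To illustrate the blind translation described above, here is a rough sketch
in Python.  It is not Lynx code.  The phrase "Русская версия" is only my
assumption for what the link text says, the list of charsets is taken from
the link labels quoted above, and lynx_like_translate and emits_csi_byte are
hypothetical helper names.  Each encoded label is treated as iso-8859-1 and
translated to cp437, and we check which encodings end up emitting 0x9b.

```python
PHRASE = "Русская версия"  # assumed link text, not taken from the page
CHARSETS = ["cp1251", "koi8-r", "cp866", "iso-8859-5"]


def lynx_like_translate(raw: bytes) -> bytes:
    """Treat unlabeled bytes as iso-8859-1 and translate them to cp437."""
    text = raw.decode("iso-8859-1")
    # Characters with no cp437 equivalent become '?' in this sketch.
    return text.encode("cp437", errors="replace")


def emits_csi_byte(charset: str) -> bool:
    """True if the translated label contains 0x9b (CSI on ANSI terminals)."""
    raw = PHRASE.encode(charset)
    return 0x9B in lynx_like_translate(raw)


affected = [cs for cs in CHARSETS if emits_csi_byte(cs)]
print(affected)  # ['cp866']
```

Under these assumptions only the encoding that happens to contain the byte
0xA2 triggers the problem; the other labels just turn into harmless garbage.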

Translation from iso-8859-1 to display character set occurs with both
non-chartrans and chartrans Lynx.  But with the latter you have more
options.   Try

  -assume_charset=iso-8859-5 http://www.demos.su/

and select for example "7 bit characters" or one of the RFC1345 things
near the end of the list, and one of the 5 versions will become nearly
readable.  Do the same with -assume_charset=koi8-r, and it will be
another one.
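
What assuming a charset buys you can be sketched like this.  Again this is
Python with its standard codecs, not the chartrans code; the phrase is an
assumed example, and TRANSLIT is a tiny hypothetical table standing in for a
"7 bit characters" style approximation that covers only the letters used here.

```python
PHRASE = "Русская версия"  # assumed example text
raw = PHRASE.encode("koi8-r")  # bytes as a koi8-r page would send them

# Hypothetical 7-bit approximation table, only for the letters in PHRASE.
TRANSLIT = str.maketrans({
    "Р": "R", "у": "u", "с": "s", "к": "k", "а": "a",
    "я": "ya", "в": "v", "е": "e", "р": "r", "и": "i",
})


def show(raw_bytes: bytes, assumed_charset: str) -> str:
    """Decode with the assumed charset, then approximate in 7-bit ASCII."""
    return raw_bytes.decode(assumed_charset).translate(TRANSLIT)


right = show(raw, "koi8-r")
wrong = show(raw, "iso-8859-5")
print(right)  # Russkaya versiya
```

With the wrong assumption the bytes still decode, but to different Cyrillic
letters, so the result stays unreadable; only the matching charset gives
something a reader can recognize.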

   Klaus
