lynx-dev

Re: lynx-dev coloring with character-set=utf-8


From: Klaus Weide
Subject: Re: lynx-dev coloring with character-set=utf-8
Date: Wed, 28 Jul 1999 23:31:14 -0500 (CDT)

On 29 Jul 1999, Christian Weisgerber wrote:

> Klaus Weide <address@hidden> wrote:
> 
> > [*] Afaik.  One difference though: Lynx does more screen refreshing
> > for UTF-8 output.  Near end of display_page() in GridText.c:
> > 
> >     if (HTCJK != NOCJK || text->T.output_utf8) {
> >         /*
> >          *  For non-multibyte curses.
> >          */
> >         lynx_force_repaint();
> >     }
> > 
> > Does taking this out change things?   
> 
> Yes, that fixes it. Highlighting now works for a display character set
> of UTF-8.
> 
> --- lynx2-8-2/src/GridText.c.orig     Fri May 28 16:04:01 1999
> +++ lynx2-8-2/src/GridText.c  Thu Jul 29 03:45:40 1999
> @@ -1728,7 +1728,7 @@
>      }
>  #endif /* DISP_PARTIAL */
>  
> -    if (HTCJK != NOCJK || text->T.output_utf8) {
> +    if (HTCJK != NOCJK) {
>       /*
>        *  For non-multibyte curses.
>        */

Not that I understand why - maybe Tom has some better ideas.

I added the "more refreshing for output_utf8" in that place a long time
ago, I must have had some reason then...  At that point though there
was no 'lynx_force_repaint();', in its place was just a 'clearok(curscr,
TRUE);'.  Maybe reinstating that for the output_utf8 case instead of
the full lynx_force_repaint() would do some good in some situations -
I don't know.

This reminds me now that all this UTF-8 output stuff works only by
accident anyway...  It fails if the curses lib tries too hard to optimize
its output.  If it just throws out the chars in a line as they come
(i.e. as they have been fed to addstr()) - good.  If it tries to optimize,
e.g. by replacing repeated blanks in a line with a cursor movement
terminal command - bad: the position in the line will be wrong if that
happens after one or more UTF-8 characters.  The clearok() was necessary
for a similar reason, as I recall (garbage characters from the previous
screen would remain visible when paging through a document with non-ASCII
characters).
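
A small sketch of the column drift described above (illustrative only, not
code from Lynx; the names utf8_char_count and column_drift are made up for
this example).  It assumes valid UTF-8 input and one screen column per
character, and compares the number of bytes in a string, which is what a
non-multibyte curses takes to be the number of columns used, with the number
of characters the terminal actually displays:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 *  Count the characters in a UTF-8 string by skipping continuation
 *  bytes (bit pattern 10xxxxxx).  Assumes valid UTF-8 and one screen
 *  column per character (no combining or double-width characters).
 */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/*
 *  How many columns too far to the right a byte-counting display
 *  library believes the cursor to be after writing s.
 */
size_t column_drift(const char *s)
{
    return strlen(s) - utf8_char_count(s);
}
```

For the five-byte string "caf\xC3\xA9" (four displayed characters),
column_drift() returns 1: the library believes the cursor is one column
further right than it really is, so any cursor movement it computes from
that position would land in the wrong place.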

Slang seemed to do "better" in that respect than ncurses when I compared
them, i.e. less optimizing - but that was a long time and several
version numbers ago.

Also, lines will be cut short (or maybe wrapped) inappropriately when
the number of chars in a line as curses or slang understands them (NOT
of real displayed "characters") exceeds the screen width.  That's bound
to happen for anything that has more than a few non-ASCII characters per
line.  For slang, there is -DSLANG_MBCS_HACK to work around this problem.
It deceives the slang display system by making it think that the screen
is six times as wide as it really is.  For (n)curses there is nothing
comparable.  Maybe Tom has ideas how to solve this, or knows how other
programs deal with it (if at all).  Now that there is a UTF-8 xterm, these
things should get more attention...
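
The cutoff effect and the six-fold widening can be sketched as follows
(again illustrative only, not code from Lynx or slang; chars_shown,
pretended_cols and MBCS_WIDTH_FACTOR are hypothetical names).  The sketch
assumes valid UTF-8 and a display library that treats every byte as one
column and stops once the line is "full".  A factor of six matches the
longest byte sequence UTF-8 allowed at the time, though that reasoning is an
assumption here, not something stated above:

```c
#include <assert.h>
#include <stddef.h>

#define MBCS_WIDTH_FACTOR 6	/* hypothetical name; the factor is from the text */

/*
 *  Number of characters of s that get displayed if the library
 *  counts each byte as one column and stops after `cols` columns.
 *  A character is counted only if all of its bytes fit.
 */
size_t chars_shown(const char *s, size_t cols)
{
    size_t bytes = 0, chars = 0;

    assert(s != NULL);
    while (s[bytes] != '\0') {
        size_t len = 1;

        /* Extend over the continuation bytes of this character. */
        while (((unsigned char) s[bytes + len] & 0xC0) == 0x80)
            len++;
        if (bytes + len > cols)
            break;
        bytes += len;
        chars++;
    }
    return chars;
}

/*
 *  Screen width reported to the display library so that even a line
 *  made up entirely of six-byte characters is not cut short.
 */
size_t pretended_cols(size_t real_cols)
{
    return real_cols * MBCS_WIDTH_FACTOR;
}
```

With a real width of 4 columns, a line of four two-byte characters is cut
after two of them by chars_shown(line, 4), while
chars_shown(line, pretended_cols(4)) lets all four through.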

Summary: for any serious use of UTF-8 screen output with lynx, compiling
with slang lib and -DSLANG_MBCS_HACK is still recommended.

Very useful page for testing (including the cutoff/wrap effect):
<http://www.cogsci.ed.ac.uk/~richard/unicode-sample.html>.
    Klaus

