Re: LYNX-DEV fixed bug on bsdi


From: Foteos Macrides
Subject: Re: LYNX-DEV fixed bug on bsdi
Date: Sat, 27 Dec 1997 17:23:31 -0500 (EST)

Laura Eaves <address@hidden> wrote:
>I'm not familiar with the general syntax for hrefs, but having looked at
>the code in src/LYCharUtils.c I decided to try to fool the code by planting
>an &nbsp in an href to see what it would do.  I tried ac104 (with my fix)
>and fotemods 12/22/97.  See http://www.intac.com/~leaves/t.html.
>Printing the list page I get
>
>>                         You have reached the List Page
>>                                        
>> Lynx Version 2.7.1fs12/22/97
>> 
>>    References in file://localhost/intac/u4/leaves/public_html/t.html
>>    
>>      * [1]http://www.qvc.com/scripts/detail.dll?Product&A7521
>>      * [2]http://www.qvc.com/scripts/detail.dll?Product&A39204
>>      * [3]http://www.qvc.com/scripts/detail.dll?Product%C2%A0
>> ...
>
>With ac104+fix I get
>
>>                         You have reached the List Page
>>                                        
>> Lynx Version 2.7.1ac-0.104
>> 
>>    References in this document:
>>    
>>      * [1]http://www.qvc.com/scripts/detail.dll?Product&A7521
>>      * [2]http://www.qvc.com/scripts/detail.dll?Product&A39204
>>      * [3]http://www.qvc.com/scripts/detail.dll?Product 
>>...                                                     |
>                                                       (160)
>Perhaps I should rt*m before asking but what should the behavior be for
>        HREF="/scripts/detail.dll?Product&nbsp " ?

        Both lynx271f and the devel code use my LYLegitimizeHREF() in
LYCharUtils.c to strip the ASCII spaces from the ends of those HREFs,
so the third ends with the named entity for a non-breaking space.  The
devel code then has EXP_CHARTRANS #ifdef'ing for using either Klaus'
LYUCFullyTranslateString() or my LYUnEscapeToLatinOne(), but the latter,
though still declared in LYCharUtils.h, is not in the devel code (I
don't know why Klaus left that #ifdef'ing and header declaration
despite having removed the function).  The devel code's
LYUCFullyTranslateString() then, among other things, unescapes entities,
so the named entity becomes the raw character with decimal value 160,
which is illegal in URLs.
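
        As a rough illustration of why the third HREF still ends with
the entity after trimming, here is a minimal sketch of stripping only
ASCII spaces from both ends of a string.  It is not the actual
LYLegitimizeHREF() code; the name trim_ascii_spaces is hypothetical and
the real function does more than this.

```c
#include <string.h>

/*
 * Hypothetical helper, NOT taken from LYCharUtils.c: remove ASCII
 * spaces (0x20) from both ends of s, in place.  Anything else,
 * including the text "&nbsp", is left untouched.
 */
static void trim_ascii_spaces(char *s)
{
    size_t len = strlen(s);
    size_t start = 0;

    /* Drop trailing spaces. */
    while (len > 0 && s[len - 1] == ' ')
        len--;
    /* Skip leading spaces. */
    while (start < len && s[start] == ' ')
        start++;

    /* Shift the kept part to the front and terminate it. */
    memmove(s, s + start, len - start);
    s[len - start] = '\0';
}
```

Applied to "/scripts/detail.dll?Product&nbsp " this removes the final
space and leaves "&nbsp" as the last thing in the string, which is what
the entity handling then sees.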

        The lynx271f code did not incorporate Klaus'
LYUCFullyTranslateString() (because it's an excessively hairy and
unmaintainable mess, IMHO).  It uses my LYUnEscapeToLatinOne() which
also converts the named entity to 160, but then converts that to the
corresponding Unicode/UTF-8 di-byte character, and then the two (8-bit)
bytes to two hex-escaped characters, to make the URL legal in accordance
with current IETF drafts for internationalization of URLs.  Since you
have the 1997-12-22 lynx271f, you could have read this explanation:

[...]
/*
**  This function converts any named or numeric character
**  references in allocated strings to their ISO-8859-1
**  values if they are in the valid ASCII range, and if
**  not, converts them to UTF-8 and hex escapes them.  If
**  the isURL flag is TRUE, it also hex escapes ESC and
**  trims any leading or trailing blanks.  Otherwise, it
**  strips out ESC, as would be done when the "ISO Latin 1"
**  Character Set is selected.  HTChunk functions are used
**  to keep memory allocations at a minimum. - FM
*/
*/
PUBLIC void LYUnEscapeToLatinOne ARGS3(
        HTStructured *,         me,
        char **,                str,
        BOOLEAN,                isURL)
{
[...]
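
        The arithmetic behind the %C2%A0 in the lynx271f list page can
be sketched as below.  This is not the LYUnEscapeToLatinOne() code
itself, only an illustration under the assumption that the value falls
in the two-byte UTF-8 range; the name escape_two_byte_utf8 is made up
for the example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, NOT taken from the Lynx sources: encode a code
 * point in the range 0x80..0x7FF as two UTF-8 bytes and write them to
 * buf as hex escapes ("%XX%XX").  buf must hold at least 7 bytes.
 * Returns 0 on success, -1 if the value is out of range or buf is
 * too small.
 */
static int escape_two_byte_utf8(unsigned int code, char *buf, size_t buflen)
{
    unsigned int hi, lo;

    if (code < 0x80 || code > 0x7FF || buflen < 7)
        return -1;

    hi = 0xC0 | (code >> 6);    /* lead byte:  110xxxxx */
    lo = 0x80 | (code & 0x3F);  /* trail byte: 10xxxxxx */

    snprintf(buf, buflen, "%%%02X%%%02X", hi, lo);
    return 0;
}
```

For 160 (0xA0) the lead byte is 0xC2 and the trail byte is 0xA0, so the
result is "%C2%A0", matching the third reference in the lynx271f output
quoted above.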

        The "internationalized URLs" drafts have not yet been ratified
by the IETF, and if they are changed before being adopted, what lynx271f
presently does may not remain correct, but retaining 8-bit characters in
URLs definitely is wrong, and seems certain to remain so.

                                Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 address@hidden         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
