lynx-dev

Re: LYNX-DEV Internal MIME types


From: Christopher R. Maden
Subject: Re: LYNX-DEV Internal MIME types
Date: Tue, 29 Apr 1997 17:37:55 GMT

[Klaus Weide]
> If parsing is not expensive, the same effect (avoiding re-fetch) can
> be had with a cache of the raw data.  So I don't think the advantage
> here is enough to generate "excitement".  It probably has more to do
> with sharing the data structure (the "tools" you also mentioned),
> and with reuse of code?
> 
> Also, have you considered that such a stored structure would need
> some rules about its "cache" aspects?  I.e. does it expire, when
> does it have to be flushed.  A "normal" cache of raw data from the
> network might be better with those things.

The structure(s) would be stored only in memory, and purged between
Lynx sessions (and indeed, after a certain number of new documents had
been requested).  In addition, each structure would carry any
metadata, like Expires headers, that might be relevant in the short
term.
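
To make the purging rules concrete, here is a rough sketch in Python
(not Lynx code, and not a proposal for the actual implementation).
The class name, the max_docs limit, and the expires_at field are all
assumptions made up for illustration: parsed structures live only in
memory, the oldest is dropped once too many new documents arrive, and
an entry past its Expires time is treated as missing.

```python
from collections import OrderedDict
import time


class ParsedDocCache:
    """Hypothetical in-memory store of parsed document structures."""

    def __init__(self, max_docs=10):
        self.max_docs = max_docs
        # url -> (structure, expires_at); expires_at is None if no Expires header
        self._docs = OrderedDict()

    def put(self, url, structure, expires_at=None):
        # Re-inserting a URL moves it to the "newest" end.
        self._docs.pop(url, None)
        self._docs[url] = (structure, expires_at)
        # Purge the oldest structures once too many new documents have arrived.
        while len(self._docs) > self.max_docs:
            self._docs.popitem(last=False)

    def get(self, url, now=None):
        entry = self._docs.get(url)
        if entry is None:
            return None
        structure, expires_at = entry
        now = time.time() if now is None else now
        if expires_at is not None and now >= expires_at:
            # The stored Expires metadata says this copy is stale.
            del self._docs[url]
            return None
        return structure
```

Nothing here is written to disk, so the whole thing disappears when
the session ends.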

> But somebody still needs to provide the information *in* the
> document, in a usable format.  HTML already does a reasonable job of
> saying "This IS a picture (but you cannot see it)".  With XML people
> can write <image-of-a-puppy>, instead of <img>.  But there must be
> something more...

Well, images are always going to be a problem.  They still require a
human to translate image-to-text.  The benefit of generic markup is
more plainly evident in describing structures; the author or publisher
decides what information about the markup is relevant, and includes
it.  For instance, usually it's not necessary to label individual
words, phrases, and sentences.  But for some scholarly purposes, it is
useful - so they do it!  You can't do that with any one fixed DTD,
like HTML.

> I think this is relying heavily on authoring tools and that they
> will do the right thing, right?

Not heavily.  It's pretty simple for humans to create XML themselves.
No one is anticipating that XML will replace HTML, only that it will
provide another, richer, markup alternative when HTML is insufficient.

> Not about the WAI, but about these standardized "Object Models".
> Just an ill-expressed, and vague, concern that they have something
> to do with technologies that want to make the whole Internet one
> large desktop (but the client loses control over what data are
> exchanged, and user loses control over client).

Ah.  No, the DOM is just a programmatic interface to a parsed document
structure, standardized for interoperability of tools.  I suppose a
canonical file form could be developed and sent "premasticated" over
the 'net, but that's not the main intent.  The DOM is a structure that
a parser creates after receiving a document, to which other routines
can have access.
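
As an illustration of "a structure to which other routines can have
access", here is a small sketch using Python's xml.dom.minidom as a
stand-in parser.  The collect_titles routine and the sample document
are made up for the example; the point is only that the routine works
from the parsed structure and never touches the source text.

```python
from xml.dom import minidom


def collect_titles(document):
    """A 'tool' that sees only the parsed structure, not the source text."""
    return [node.firstChild.data
            for node in document.getElementsByTagName("title")]


# The parser builds the structure once, after receiving the document.
doc = minidom.parseString(
    "<book><title>My Life</title>"
    "<chapter><title>Early Years</title></chapter></book>"
)

# Any other routine can then be handed the same structure.
titles = collect_titles(doc)
```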

Another argument for caching the DOM instead of (or in addition to)
the source is entity expansion.  Picture a large, modular document
like this:

<book>
<title>My Life</title>
&chap1;
&chap2;
...
</book>

Lynx could fetch this, build an internal structure, apply the
stylesheet, and throw away the structure.  The entity references would
be rendered as expansion links.  The user requests one; now Lynx must
fetch the entity, rebuild the structure from the initial file, insert
the structure of the entity, reapply the stylesheet, and render.
Compare that with fetching the entity and inserting its parsed
structure into the existing structure.  Now think about the difference
as various entities are requested in various orders.
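
The second approach might look roughly like the sketch below, again in
Python with xml.etree.ElementTree standing in for the parser.  The
entity-ref placeholder element, the FAKE_ENTITIES table, and both
function names are assumptions invented for the example; in
particular, the placeholder element is only a way to model an
unexpanded reference in the kept structure, not real XML entity
syntax.

```python
import xml.etree.ElementTree as ET

# Stand-in for network fetches; the names and contents are made up.
FAKE_ENTITIES = {
    "chap1": "<chapter><title>Childhood</title></chapter>",
    "chap2": "<chapter><title>Providence</title></chapter>",
}
fetch_count = {"n": 0}


def fetch_entity(name):
    """Pretend to fetch one entity over the network."""
    fetch_count["n"] += 1
    return FAKE_ENTITIES[name]


def expand_in_place(parent, name):
    """Replace the placeholder for `name` under `parent` with its parsed subtree."""
    for index, child in enumerate(list(parent)):
        if child.tag == "entity-ref" and child.get("name") == name:
            subtree = ET.fromstring(fetch_entity(name))
            subtree.tail = child.tail
            parent.remove(child)
            parent.insert(index, subtree)
            return True
    return False  # already expanded, or no such reference


# The structure built from the initial file is kept, not thrown away.
book = ET.fromstring(
    "<book><title>My Life</title>"
    "<entity-ref name='chap1'/><entity-ref name='chap2'/></book>"
)

# Entities can be requested in any order; each costs one fetch and one
# small parse, and the initial file is never re-parsed.
expand_in_place(book, "chap2")
expand_in_place(book, "chap1")
```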

I am really a parsing guy and don't know a lot about network
friendliness, so all of these comments and suggestions are great.  If
there's something wrong with my model above, I'd really like to know
ASAP, before I try to implement something really stupid.

-Chris
-- 
Christopher R. Maden                  One Richmond Square
DynaText SIT Technical Support        Providence, RI 02906 USA
Inso Corporation                      +1.401.421.9550 (voice)
Electronic Publishing Solutions       +1.401.521.2030 (facsimile)