Re: Correspondence between web-pages and Info-pages
From: Kelly Dean
Subject: Re: Correspondence between web-pages and Info-pages
Date: Tue, 30 Dec 2014 11:17:45 +0000
Stefan Monnier wrote:
> Hey, I think this is a great idea: replace the "(emacs)Title"
> syntax with a URL. When passed to Info, these URL would be redirected
> to the local Info pages.
>
> The main downside is that those URLs would take up more space. But the
> upside is not just greater exposure of our HTML manuals to search
> engines, but also the removal of the ad-hoc (info "(emacs)Title") syntax.
Don't overlook two important parts of this: using the same name both for user
input and for display, and using different names for different formats of a
page (Info vs. HTML).
Web browsers have some useful navigation features:
0. An address bar, which shows the name of the currently displayed page.
1. A drop-down menu that shows the sequence of visited pages for the current
buffer, and the current position within that sequence.
2. In the address bar, you can enter a new name and press enter to open that
page.
3. The name shown is the same string as the string you enter to open the page
by name.
4. You can copy the name that's shown.
5. Because of the preceding three features, you can save the name into a text
file that you use as a list of bookmarks, paste the name back into the address
bar to return to the page, and use the name to cite the page so your readers
can open it; IOW, you can use the name to link to the page.
6. The name can include a hash mark and section name at the end, so that when
you open the page, the browser jumps to the named section.
Emacs's Info browser has feature #0, but lacks the rest. Emacs's Info-history
command partially provides #1, but doesn't show the actual link sequence that's
traversed by Info-history-back and Info-history-forward. Instead of #2, Emacs
makes you remember a command («g», for Info-goto-node) for entering the name of
the page to open. Regarding #3, for example, I'm currently viewing the page
with the shown name ⌜(elisp)Top > Keymaps > Translation Keymaps⌝[0], but that's
effectively like an HTML page title; it isn't the name used for opening the
page.
([0]: I actually had to manually transcribe that name, because incredibly,
Emacs lacks feature #4. See bug #19471.)
Features #1 and #2 would be nice to have but aren't essential, #4 is essential
but fortunately is easy to implement, and #6 is unnecessary if pages aren't too
long. But the lack of #3, and consequently of #5, is the major problem. If you
adopt URL syntax for page names, be sure to not only use it for Info-goto-node,
but also display it in the address bar in the Info browser, e.g.
⌜http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps⌝, regardless
of whatever other syntax (e.g. ⌜(elisp)Translation Keymaps⌝ as the short name)
might also be usable to open the page. For #2, have the address bar be
editable, and have Info-goto-node simply move focus to it.
There was a proposal somewhere in this ginormous thread to use the same name
for both an Info page and an HTML page, and serve the Info page from the local
cache but the HTML page via HTTP from the official server. That's a bad idea,
because then the name's scope isn't global; instead, what the name resolves to
depends on which system (local or remote-official) is queried.
If you try to fix that by relying on the User-agent or some other request
header to choose which format to return, and having Emacs cache and use the
format returned by sending ⌜Info⌝ for that header and having web browsers use
the format returned by sending any other value for that header, then the URL is
no longer the name of the page; instead, the URL+header is the name, which is a
facepalm-inducing convention that's already a widespread plague that Emacs
shouldn't exacerbate, akin to using URL+source-ip for page names in order to
balkanize the web (conspicuous offenders include Google and CloudFlare).
You could instead conflate the protocol name and the page type name and say
⌜info:gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps⌝ if you want.
That would still enable feature #3. Or instead append a ⌜.info⌝ extension to
the end of the name, as is commonly done with HTML, though that could be
misleading if the page doesn't have its own dedicated Info file. Both of these
require you to replace the ⌜info⌝ in the name by ⌜http⌝ or ⌜html⌝ before
sending the name to non-Emacs users.
I propose a cleaner solution: have the name with no type extension resolve to a
redirect. Do a client-side redirect, not server-side: serve a consistent response
to all clients (regardless of request headers), containing both a standard HTTP
redirect that web browsers will follow, and a new Info-file header that Info
browsers will follow (web browsers will ignore it). The former points to a page
with the same name but with ⌜.html⌝ appended, and the latter to the Info file
that contains the requested Info page. This way, the extensionless URL is
effectively the name of a directory from which browsers automatically choose
one of two files, but the URL alone, not the URL plus a header, is the name of
the directory, and the files have their own URLs.
When you receive page URLs from non-Emacs users, it's easy enough to chop off
the ⌜.html⌝ extension. When you send them page URLs without the extension,
their browsers will automatically redirect.
For example, if your browser (web or Info) sends this query for a documentation
page:
    GET /emacs/24.4/docs/elisp/keymaps/translation_keymaps HTTP/1.0
    Host: gnu.org
then the response is:
    HTTP/1.0 302 Found
    Location: http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps.html
    Info-file: http://gnu.org/emacs/24.4/docs/elisp.info
Web browsers will redirect to the URL in the Location header.
Info browsers will:
1. Fetch the file named in the Info-file header.
2. Chop the ⌜.info⌝ extension from the value of the Info-file header to get
   Info-base.
3. Chop Info-base from the front of the original page URL to get the name of the
   page (⌜/keymaps/translation_keymaps⌝ in this case) within the Info file.
4. Load that page from the file.
5. In the address bar, display the original page URL.
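The name-derivation steps above can be sketched in a few lines of Python. This is only an illustration of the string handling, not a proposed implementation; the function names parse_headers and info_page_name are made up for the example, and the Info-file header is the new header proposed in this message, not an existing standard.

```python
INFO_EXT = ".info"


def parse_headers(response_text):
    """Parse the header lines of a raw HTTP response into a dict.

    Header names are lowercased. The status line is skipped.
    """
    headers = {}
    lines = response_text.strip().splitlines()
    for line in lines[1:]:
        if not line.strip():
            break  # blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def info_page_name(page_url, info_file_url):
    """Return the name of the page within the Info file.

    Chops the .info extension from the Info-file value to get Info-base,
    then chops Info-base from the front of the original page URL.
    """
    if not info_file_url.endswith(INFO_EXT):
        raise ValueError("Info-file value must end in .info")
    info_base = info_file_url[: -len(INFO_EXT)]
    if not page_url.startswith(info_base):
        raise ValueError("page URL does not start with Info-base")
    return page_url[len(info_base):]


# The example exchange from this message.
response = (
    "HTTP/1.0 302 Found\n"
    "Location: http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps.html\n"
    "Info-file: http://gnu.org/emacs/24.4/docs/elisp.info\n"
)
page_url = "http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps"
headers = parse_headers(response)
page_name = info_page_name(page_url, headers["info-file"])
```

Here page_name comes out as ⌜/keymaps/translation_keymaps⌝, and the address bar would keep showing page_url unchanged.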
Info can send all web requests through a cache. Distribute Emacs with the cache
preloaded with Info files, including the original URL for each of those files.
When Info queries the cache for a cached Info file, the cache returns a file
descriptor for that file.
When Info queries for a noncached Info file, the cache downloads and caches it
and returns a descriptor.
When Info queries for any URL that starts with a string matching the URL of
a cached Info file (excluding the ⌜.info⌝ extension), and the query URL itself
doesn't have a filename extension, the cache generates and returns ⌜Info-file:
X⌝ where X is the URL of the Info file. Info then processes this as a redirect.
When Info queries for anything else, the cache sends the query to the named
server and returns the response to Info. If the response is a redirect, Info
processes it.
This way, no network traffic is necessary for cached files. This also lets the
same cache serve web browsers, not just Info browsers. The cache could be
preloaded with HTML files for people who really don't like Info, and both Info
and HTML files for people who like both. The cache doesn't need to be a server;
it can just be a library, as SQLite is, and integrated into Emacs if only
Info and Eww use it.
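The four cache rules described above could look roughly like this as a library. It is a sketch under several assumptions: the class name DocCache and the tagged return values are invented for the example, file contents stand in for the file descriptors mentioned above, the fetch callable stands in for a real HTTP request, and the prefix match requires a ⌜/⌝ after the Info base so that one manual's base can't accidentally match another manual whose name merely begins with the same letters.

```python
import posixpath
from urllib.parse import urlsplit

INFO_EXT = ".info"


def has_extension(url):
    """True if the last path component of the URL contains a dot."""
    return "." in posixpath.basename(urlsplit(url).path)


class DocCache:
    def __init__(self, preloaded, fetch):
        # preloaded: dict mapping Info file URL -> file contents (bytes)
        # fetch: callable(url) -> bytes, a stand-in for a network request
        self.files = dict(preloaded)
        self.fetch = fetch

    def query(self, url):
        # Rules 1 and 2: the query names an Info file. Serve it from the
        # cache, downloading and caching it first if necessary.
        if url.endswith(INFO_EXT):
            if url not in self.files:
                self.files[url] = self.fetch(url)
            return ("file", self.files[url])
        # Rule 3: an extensionless URL under the base of a cached Info
        # file yields a locally generated "Info-file: X" redirect.
        if not has_extension(url):
            for file_url in self.files:
                if not file_url.endswith(INFO_EXT):
                    continue
                base = file_url[: -len(INFO_EXT)]
                if url == base or url.startswith(base + "/"):
                    return ("info-file", file_url)
        # Rule 4: anything else goes to the named server.
        return ("response", self.fetch(url))
```

With elisp.info preloaded, a query for the translation_keymaps page URL returns ("info-file", URL of elisp.info) without calling fetch at all, which is the no-network-traffic property claimed above.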
Indirecting through the Info-file header enables splitting or combining Info
files without affecting the page URLs. E.g. elisp.info could be split up so
that «keymaps», etc. are in separate files, or elisp.info, emacs.info, and all
the other Info files could be combined into one big docs.info file, but with
either of those changes, the page URLs would remain unchanged.
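To make that concrete, here is a small worked example in Python. The split and combined Info file URLs are hypothetical (I'm guessing where a keymaps.info or docs.info would live), and locate is just an illustrative helper that chops Info-base off the page URL.

```python
INFO_EXT = ".info"


def locate(page_url, info_file_url):
    """Return the page's name within the given Info file."""
    base = info_file_url[: -len(INFO_EXT)]
    if not (info_file_url.endswith(INFO_EXT) and page_url.startswith(base)):
        raise ValueError("page URL is not served by this Info file")
    return page_url[len(base):]


# The page URL never changes.
PAGE = "http://gnu.org/emacs/24.4/docs/elisp/keymaps/translation_keymaps"

# Three possible values of the Info-file header for that same page.
# Only the first appears in this message; the other two are assumed.
LAYOUTS = {
    "current": "http://gnu.org/emacs/24.4/docs/elisp.info",
    "split": "http://gnu.org/emacs/24.4/docs/elisp/keymaps.info",
    "combined": "http://gnu.org/emacs/24.4/docs.info",
}

names = {layout: locate(PAGE, url) for layout, url in LAYOUTS.items()}
```

The name within the file differs per layout (⌜/keymaps/translation_keymaps⌝, ⌜/translation_keymaps⌝, and ⌜/elisp/keymaps/translation_keymaps⌝ respectively), but PAGE, which is what users see, copy, and cite, is the same in all three.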
It doesn't matter whether URLs are used in Info files (or in Texinfo files), or
the Info browser just translates the names for input and display. What matters
for users is just the Info browser's UI. But if Info files use only relative
names, then the browser must know the original URL of the file in order to
construct the URL for each page and show that name in the address bar.
Therefore, the browser can't just search a path on the local system to find
Info files, like it currently does when the user runs e.g. ⌜(info "(elisp)")⌝,
unless the file format is changed to include its own URL. Alternatively, and
more cleanly, the browser could just query the cache and have the cache do the
search, and the cache can return ⌜Info-file: X⌝ if it finds a match, which Info
then processes as a redirect.
For any query without a version number embedded in the name, the server should
respond with a redirect to the same name but with the latest version number
embedded. This makes it easy to check for updates, and to link to the
always-latest version of a page.
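A server-side sketch of that rule, in Python. It assumes the URL layout used in the examples in this message, with the version number as the path segment right after ⌜emacs⌝; the function name versioned_redirect and the way the latest version is passed in are made up for illustration.

```python
import re

# A version segment looks like "24", "24.4", "24.4.1", etc.
VERSION_RE = re.compile(r"^\d+(\.\d+)*$")


def versioned_redirect(path, latest):
    """Return the redirect target for a versionless path, else None.

    path is the request path, e.g. "/emacs/docs/elisp/keymaps".
    latest is the latest version string, e.g. "24.4".
    """
    segments = path.split("/")  # leading "/" gives an empty first segment
    if len(segments) < 2 or segments[1] != "emacs":
        return None  # not a path this rule applies to
    if len(segments) > 2 and VERSION_RE.match(segments[2]):
        return None  # version already embedded; no redirect needed
    return "/".join(segments[:2] + [latest] + segments[2:])


target = versioned_redirect(
    "/emacs/docs/elisp/keymaps/translation_keymaps", "24.4"
)
```

Here target is ⌜/emacs/24.4/docs/elisp/keymaps/translation_keymaps⌝, which the server would send back in a redirect; a path that already contains a version yields None and is served normally.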
For non-English manuals, there's no need to embed the language name in the URL;
just use the source-ip address of the request to choose which version to serve,
like Google does. (Just checking if anybody is still awake.)