Re: [Orgmode] Re: Custom entry IDs in HTML export


From: Sebastian Rose
Subject: Re: [Orgmode] Re: Custom entry IDs in HTML export
Date: Fri, 17 Apr 2009 00:37:49 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.92 (gnu/linux)

Carsten Dominik <address@hidden> writes:
> On Apr 16, 2009, at 10:50 PM, Sebastian Rose wrote:
>
>> Carsten Dominik <address@hidden> writes:
>>> Hi Sebastian,
>>>
>>> On Apr 16, 2009, at 3:14 PM, Sebastian Rose wrote:
>>>
>>>> Hm - counter arguments?
>>>>
>>>> The only counter argument is that hand-made IDs for links are prone to
>>>> error. But that risk should be up to the user.
>>>
>>> Yes, and during the export I can actually check and throw a warning or an
>>> error if the same custom ID shows up twice.
>>>
>>>>
>>>> I actually changed my mind a little on this point.
>>>>
>>>> If the user clicks a section link in the TOC to jump to a section, he
>>>> can bookmark the page with exactly that jump target. If the jump target
>>>> (the ID) is human readable, the bookmark is more descriptive.
>>>
>>> Yes, this is really the best application.  Also, when hovering over internal
>>> links, it is helpful if the link displays the human-readable form.
>>>
>>>> Just one wish:
>>>>
>>>> The containers should reflect that change (HRID = human readable id):
>>>>
>>>> <div   id="outline-container-HRID">
>>>> <h4  id="HRID">                   headline    </h4>
>>>> <div id="outline-text-HRID">
>>>>   sections content...
>>>> </div>
>>>> </div>
>>>
>>>
>>> Sure, we can do this.  I would then add sec-xxx as one
>>> of the alternative anchors as well.
>>>
>>> However:  If I make the structure as you indicate above,
>>> do I understand correctly that the structure of a section without a
>>> human-readable id should be changed to this:
>>>
>>> <div   id="outline-container-sec-1.1">
>>> <h4  id="sec-1.1">                   headline    </h4>
>>> <div id="outline-text-sec-1.1">
>>>   sections content...
>>> </div>
>>> </div>
>>>
>>>
>>> Note the "sec-" which is added to the stuff that currently
>>> defines the structure.
>>
>>
>>
>> I considered the `sec-' part of the automatic IDs.
>>
>> In either case I'd have to adjust org-info.js. So why not go for the
>> human readable IDs without `sec-'?
>>
>>
>> Right now we have:
>>
>> <div id="outline-container-2" class="outline-2">
>> <h2 id="sec-2"><span class="section-number-2">2</span> Things I want to find out </h2>
>> <div class="outline-text-2" id="text-2">
>>
>> The `sec-' part is in the headlines ID only.
>
>
> Why?  Because this introduces a parsing inconsistency for you between
> automatic and custom IDs.  For the automatic ones, you need to strip "sec-"
> to retrieve the correct suffix for the container etc. names.  With the custom
> IDs, no such stripping should be done.  Does this not make things harder?
>
> - Carsten


That's the way it is _now_. The structure above is taken from one of my
exported org-files. But it's not that hard to strip `sec-' :)

Now the scanning considers `sec-' a prefix - just like
`outline-container-' and `outline-text-'.
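
A minimal sketch of that prefix handling, assuming the IDs shown above. This
is not the actual org-info.js code: the names `KNOWN_PREFIXES` and
`stripKnownPrefix` are made up for illustration, and `text-` is included only
because the current export uses it as the ID of the text container.

```javascript
// Hypothetical helper, not taken from org-info.js.
// Prefixes the scanner treats as decoration around the section suffix.
const KNOWN_PREFIXES = ['outline-container-', 'outline-text-', 'text-', 'sec-'];

// Return the part of an ID that follows a known prefix,
// or the ID unchanged if no known prefix matches.
function stripKnownPrefix(id) {
  for (const prefix of KNOWN_PREFIXES) {
    if (id.startsWith(prefix)) {
      return id.slice(prefix.length);
    }
  }
  return id;
}
```

With the current export, `outline-container-2`, `sec-2` and `text-2` all
reduce to the same suffix `2`, which is what ties the three elements together.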


But in the future:


If we now plan to use human readable IDs in the TOC, those IDs would be
the IDs of the section heading. That's why those IDs should have no
`sec-' prefix.

Otherwise, bookmark URLs would not be what we want them to be:

   http://orgmode.org/org-faq.php#sec-isearch-in-links

 instead of

   http://orgmode.org/org-faq.php#isearch-in-links



Automatic IDs on the other hand must have a prefix, since an ID may
_not_ start with a number.
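
To illustrate the constraint: in HTML 4 an ID must begin with a letter and may
continue with letters, digits, hyphens, underscores, colons and periods. A
small sketch of that check follows; the function names `isValidHtml4Id` and
`automaticId` are assumptions made for this example and are not part of Org
or org-info.js.

```javascript
// HTML 4 ID tokens: a letter first, then letters, digits, '-', '_', ':' or '.'.
const HTML4_ID = /^[A-Za-z][A-Za-z0-9\-_:.]*$/;

// True if the string is usable as an HTML 4 id attribute value.
function isValidHtml4Id(id) {
  return HTML4_ID.test(id);
}

// Hypothetical: build an automatic ID from a section number such as "1.1".
// The `sec-` prefix is what makes the result start with a letter.
function automaticId(sectionNumber) {
  return 'sec-' + sectionNumber;
}
```

A bare section number like `1.1` fails the check, while `sec-1.1` and a
human-readable ID like `isearch-in-links` both pass.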


So wouldn't it make sense to change the IDs of the containers this way:

  Case _automatic_:

       <div id="outline-container-sec-1.1" ... >
         <h3 id="sec-1.1"> .... </h3>
         <div id="outline-text-sec-1.1" ... >
         ....
         </div>
       </div>

  Case _human-readable_:

       <div id="outline-container-isearch-in-links" ... >
         <h3 id="isearch-in-links"> .... </h3>
         <div id="outline-text-isearch-in-links" ... >
         ....
         </div>
       </div>

??
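
Under the scheme proposed above, a script would no longer need to strip
anything: the container IDs follow from the heading's ID by adding a prefix,
whether that ID is automatic or human readable. A rough sketch, assuming
exactly the two cases shown; `relatedIds` and `isAutomaticId` are hypothetical
names, not functions of org-info.js.

```javascript
// Automatic IDs look like sec-1, sec-1.1, sec-2.3.4, ...
const AUTOMATIC_ID = /^sec-\d+(\.\d+)*$/;

// True if the heading ID has the shape of an automatically generated one.
// A user who picks a custom ID such as "sec-3" would be misclassified here.
function isAutomaticId(headingId) {
  return AUTOMATIC_ID.test(headingId);
}

// Derive the IDs of the surrounding containers from the heading's ID.
function relatedIds(headingId) {
  return {
    container: 'outline-container-' + headingId,
    heading: headingId,
    text: 'outline-text-' + headingId,
  };
}
```

For `sec-1.1` this yields `outline-container-sec-1.1` and
`outline-text-sec-1.1`; for `isearch-in-links` it yields
`outline-container-isearch-in-links` and `outline-text-isearch-in-links`.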


  Sebastian



>>
>>
>>
>>   Sebastian
>>
>>
>>
>>
>>>> That way the script would keep working with older pages.
>>>> Automatic IDs and human readable ones could be mixed.
>>>>
>>>>
>>>> The '<a id="">' anchors are scanned anyway, as are all jump targets in
>>>> the page.
>>>
>>> Yes, you implemented that some time ago, I remember.
>>>
>>>>
>>>> Maybe this is even the point to re-work the parser of org-info.js to
>>>> become entirely independent of the TOC. The script could search for
>>>> headings instead. That's more work, but the script would then work for
>>>> all HTML pages with a structure similar to the org-export's one:
>>>
>>> So this would mean we could read web pages with your JavaScript
>>> support even if those web pages were not created with Org?
>>> Pretty cool.
>>>
>>>> <div id=""><hx id=""></hx><div>content</div></div>
>>>>
>>>> but I could postpone this, if you fulfill my wish above.
>>>
>>>
>>> Best wishes
>>>
>>> - Carsten
>>>
>>>>
>>>>
>>>> Best wishes
>>>>
>>>> Sebastian
>>>>
>>>>
>>>>
>>>>
>>>> Carsten Dominik <address@hidden> writes:
>>>>> On Apr 16, 2009, at 10:50 AM, Sebastian Rose wrote:
>>>>>
>>>>>> Carsten Dominik <address@hidden> writes:
>>>>>>> Hi Sebastian,
>>>>>>>
>>>>>>> I kind of like the idea to have a property that can be
>>>>>>> used to set an ID, as an alternative to the <<target>>
>>>>>>> notation.  Actually, using a property seems a lot cleaner,
>>>>>>> thanks for coming up with this idea, Daniel.
>>>>>>>
>>>>>>> I can also follow the reasoning that it is useful to have
>>>>>>> the table of contents link to the human-readable id, because
>>>>>>> it provides a general, simple workflow to retrieve a link that
>>>>>>> will persist through changes of the document.  This workflow
>>>>>>> was described also by Bernt earlier in this thread.
>>>>>>>
>>>>>>> Finally, I also agree that the main id in the <h3> tag
>>>>>>> should be the automatically generated one because this is
>>>>>>> best for automatic processing and because of all the arguments
>>>>>>> you have presented.
>>>>>>>
>>>>>>> Would it cause problems for org-info.js if the toc points to
>>>>>>> a user specified anchor in the headline, instead of the main
>>>>>>> ID that is inside the <h3> tag?  This would really be the only
>>>>>>> required change.
>>>>>>
>>>>>>
>>>>>> I'll have to test this before I can give a final answer to this
>>>>>> question.
>>>>>>
>>>>>> But regardless of the results, I will adjust the script to reflect that
>>>>>> change. The script should not rule the HTML export and it will be an
>>>>>> easy thing to do.
>>>>>
>>>>> But I do want to hear any counter arguments you might have....
>>>>>
>>>>> - Carsten
>>>>>
>>>>>>
>>>>>> Sebastian
>>>>>>
>>>>>>
>>>>>>
>>>>>>> - Carsten
>>>>>>>
>>>>>>>
>>>>>>> On Mar 30, 2009, at 1:49 PM, Daniel Clemente wrote:
>>>>>>>
>>>>>>>> On Fri, Mar 27 2009, Sebastian Rose wrote:
>>>>>>>>>
>>>>>>>>> What we have now, just as Carsten said:
>>>>>>>>>
>>>>>>>>> # <<human-readable>>
>>>>>>>>> * Section B
>>>>>>>>>
>>>>>>>>> Creates this headline in HTML:
>>>>>>>>>
>>>>>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>>>>>> </h2>
>>>>>>>>>
>>>>>>>>> This is enough for all the use cases I can think of.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, this is enough except for two things:
>>>>>>>> 1. The TOC still links to #sec-2 and the user can't change that
>>>>>>>> 2. Your syntax doesn't fold very well in the outliner. I mean: if you use
>>>>>>>>
>>>>>>>>> # <<human-readable>>
>>>>>>>>> * Section B
>>>>>>>>
>>>>>>>> then the comment appears at the end of the previous section, and you can
>>>>>>>> miss it when you are viewing the heading „Section B“. I would swap both
>>>>>>>> lines (solution 1):
>>>>>>>>
>>>>>>>>> * Section B
>>>>>>>>> # <<human-readable>>
>>>>>>>>
>>>>>>>> But since there are already LOGBOOK drawers under the heading, it would be
>>>>>>>> a lot clearer to use a property, like EXPORT_ID (solution 2):
>>>>>>>>
>>>>>>>>> * Section B
>>>>>>>>> :PROPERTIES:
>>>>>>>>> :EXPORT_ID: human-readable
>>>>>>>>> :END:
>>>>>>>>
>>>>>>>>
>>>>>>>> In this way, the TOC can reliably find the EXPORT_ID, and then generate:
>>>>>>>>> <h2 id="sec-2"><a name="human-readable" id="human-readable"></a>2 Section B
>>>>>>>>> </h2>
>>>>>>>>
>>>>>>>> (You could also leave *just* the human-readable id, but having two is not
>>>>>>>> bad.)
>>>>>>>>
>>>>>>>>
>>>>>>>> I would prefer solution 1, but I hesitate because I'm not sure that the
>>>>>>>> TOC can find the ID if it is written as a comment anywhere under the
>>>>>>>> heading (and together with other things).
>>>>>>>>
>>>>>>>> Solution 2 thus involves a new property to specify the human-readable
>>>>>>>> entry ID, which will be used to link to the entry. The automatic ID
>>>>>>>> (#sec-2) will still work for all entries.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> * Distinguishing automatic and human readable IDs
>>>>>>>>>
>>>>>>>>> One thing I like is, that we now _can_ distinguish the
>>>>>>>>> `human-readable-target' (human readable) from the `sec-2' (not human
>>>>>>>>> readable and not context related) using a regular expression.
>>>>>>>>>
>>>>>>>>> In org-info.js, I can now prefer the human readable ID in <a> over an
>>>>>>>>> automatically created one, and thus use that to create the links for `l'
>>>>>>>>> and `L'. The same holds true for other programming languages and
>>>>>>>>> parsers.
>>>>>>>>>
>>>>>>>>> If we open the <h3>'s ID for user defined values (bad), we cannot
>>>>>>>>> distinguish those IDs using a regular expression and there is no way
>>>>>>>>> to detect the human readable one. There will be no way to _know_ that
>>>>>>>>> the <a>'s ID is the preferred one used for human readable links.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Solution 2 doesn't break the parsing techniques you use; in fact it can
>>>>>>>> also make clearer which ID is the human readable one and which one not.
>>>>>>>>
>>>>>>>>
>>>>>>>> This is not extremely important; just useful:
>>>>>>>> - for pages with many incoming links from external sites
>>>>>>>> - to ensure link integrity (now you can't assure that links will still
>>>>>>>>   work in 1 year ... or in some weeks)
>>>>>>>> - to avoid HTML visitors being directed to the wrong section, unable to
>>>>>>>>   find what they were looking for
>>>>>>>>
>>>>>>>>
>>>>>>>> Greetings,
>>>>>>>> Daniel
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Emacs-orgmode mailing list
>>>>>>>> Remember: use `Reply All' to send replies to the list.
>>>>>>>> address@hidden
>>>>>>>> http://lists.gnu.org/mailman/listinfo/emacs-orgmode
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449 Hannover
>>>>>> Tel.:  +49 (0)511 - 36 58 472
>>>>>> Fax:   +49 (0)1805 - 233633 - 11044
>>>>>> mobil: +49 (0)173 - 83 93 417
>>>>>> Email: address@hidden, address@hidden
>>>>>> Http:  www.emma-stil.de
>>>>>
>>>>
>>>
>>
>

-- 
Sebastian Rose, EMMA STIL - mediendesign, Niemeyerstr.6, 30449 Hannover
Tel.:  +49 (0)511 - 36 58 472
Fax:   +49 (0)1805 - 233633 - 11044
mobil: +49 (0)173 - 83 93 417
Email: address@hidden, address@hidden
Http:  www.emma-stil.de



