emacs-orgmode
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: stability of toc links


From: Nicolas Goaziou
Subject: Re: stability of toc links
Date: Sun, 02 May 2021 14:10:12 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Hello,

Timothy <tecosaur@gmail.com> writes:

> Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
>
>> I pointed out some concerns I have about the robustness of this system
>> already. I don't think you answered to any of them. I fear we may be
>> communicating past each other in this thread.
>
> Sorry about that. I'll try to address the bits I've missed in these last
> few emails.

Please note that those short answers did not help me much. So I did my
homework and looked at your code. I didn't test it thoroughly, so I may
be missing something.

>> references consist of alphanumeric characters only, so they are /de
>> facto/ compatible with any target format;
>
> This is uses characters from [a-z0-9-]

Indeed. I didn't know about punycode. It has very interesting
properties.

Now, here's the elephant in the room: "puny.el" was included in Emacs
26.1. Org cannot make use of it yet.

Also, the bootstring algorithm, and yours, are very much
English-centered, as can attest
`org-reference-contraction-stripped-words'. I insisted on non-latin
languages for a reason:

       (org-reference-contraction "こんにちは") =>  "28j2a3ar1p-"

or, for a not so long title

  (org-reference-contraction "こんにちは コンニチハ") => "v8ttbvbva7si998jvba0bzb0m-"

which is arguably worse than "org1234567".

>> references are guaranteed to be unique in the document;
>
> The suffixed number I mentioned ensures this.

Unfortunately, because of them, you cannot guarantee stable links during
export, much like random references.

For example, if you first export

  * Foo
  bar

and if you later modify your document like this

  * Foo
  baz
  * Foo
  bar

your link will now point to the "baz" contents instead of "bar". 

As a side note, this the reason why I introduced randomness in
references in the first place. We cannot reference first headline as
"headline-1", second one as "headline-2", i.e., in a monotonic way,
because we cannot assume their order is fixed.

More importantly, the above is not limited to headlines with the exact
same title. Since your algorithm truncates output, this will happen in
various, less obvious, situations.

>> cross-references between documents are stable.
>
> I'm not quite sure what to make of this.

Since you don't implement something new but re-use the existing caching
mechanism, I don't think this is an issue.

>> Also, header content is not stable enough: when you're linking to the
>> custom ID, you may be able to change the title and yet preserve the
>> link.
>
> Custom IDs still work, so I don't quite see the point here.

How can you be sure?

The point is that in some export back-ends, e.g., ASCII, you will only
provide a single reference for a headline, i.e., not one for the title
and another one for the custom ID. If your reference is based solely on
the title, the reference will break whenever you modify the title
without touching custom ID. I gave an example in an earlier post
already. This is a regression wrt the current system.

In a nutshell:

- there are very interesting points in your proposal;

- it is not applicable at the moment;

- it greatly improves references for English language, it is slightly
  better for latin languages, and worse for non-latin ones;

- it does not guarantee link stability during export;

- it introduces a regression wrt custom ID.

Notwithstanding the problem of "puny.el", the regression makes it not
suitable as a drop-in replacement for random `org-export-get-reference'
yet. With more work, it can become an interesting evolution of
`org-export-get-reference', however. Since this regression does not
affect HTML export back-ends, it could be used there meanwhile.

Link stability is still an issue, even if the proposal gives a false
sense of security in that area. I don't think we can solve it without
creating a cache for export, where you store all previous references for
a given file. Even this is not sufficient, because you can export
buffers not attached to files.

Regards,
-- 
Nicolas Goaziou



reply via email to

[Prev in Thread] Current Thread [Next in Thread]