Re: stability of toc links
From: Timothy
Subject: Re: stability of toc links
Date: Mon, 03 May 2021 04:16:06 +0800
User-agent: mu4e 1.4.15; emacs 28.0.50
Nicolas Goaziou <mail@nicolasgoaziou.fr> writes:
> Please note that those short answers did not help me much. So I did my
> homework and looked at your code. I didn't test it thoroughly, so I may
> be missing something.
It's a pity to hear that I wasn't able to suitably clarify things in my
reply. Thank you for being willing to investigate my implementation.
> Now, here's the elephant in the room: "puny.el" was included in Emacs
> 26.1. Org cannot make use of it yet.
Gah.
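For what it's worth, the availability problem could in principle be sidestepped with a guard. A minimal sketch, assuming a hypothetical helper `my/ascii-reference' (not part of Org or of my patch) and a SHA-1 fallback that I am making up purely for illustration:

```emacs-lisp
;; Sketch only: `my/ascii-reference' is a hypothetical name, and the
;; hash-based fallback is an assumption, not something Org does today.
(defun my/ascii-reference (title)
  "Return an ASCII-only string derived from TITLE.
Use punycode when puny.el can be loaded, else a short hash."
  (if (require 'puny nil t)   ; returns nil instead of signalling when absent
      (puny-encode-string title)
    ;; Fallback for Emacs versions that do not ship puny.el.
    (concat "org" (substring (secure-hash 'sha1 title) 0 7))))

(my/ascii-reference "こんにちは")
```

The obvious drawback is that the fallback gives up readability on older Emacs versions, and the same title would yield different references depending on the Emacs version used to export.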
> Also, the bootstring algorithm, and yours, are very much
> English-centered, as can attest
> `org-reference-contraction-stripped-words'. I insisted on non-latin
> languages for a reason:
>
> (org-reference-contraction "こんにちは") => "28j2a3ar1p-"
>
> or, for a not so long title
>
> (org-reference-contraction "こんにちは コンニチハ") => "v8ttbvbva7si998jvba0bzb0m-"
>
> which is arguably worse than "org1234567".
Mmmm. This isn't great. I preferred the output of Unidecode (ASCII
transliteration) mentioned previously, but that doesn't look like it
could easily be used.
>>> references are guaranteed to be unique in the document;
>>
>> The suffixed number I mentioned ensures this.
>
> Unfortunately, because of them, you cannot guarantee stable links during
> export, much like random references.
>
> For example, if you first export
>
> * Foo
> bar
>
> and if you later modify your document like this
>
> * Foo
> baz
> * Foo
> bar
>
> your link will now point to the "baz" contents instead of "bar".
>
> As a side note, this is the reason why I introduced randomness in
> references in the first place. We cannot reference first headline as
> "headline-1", second one as "headline-2", i.e., in a monotonic way,
> because we cannot assume their order is fixed.
From this I take it you'd rather have a broken reference than an incorrect
one? I don't think there's any "good" solution here, just pick your
poison (and, no surprise, I prefer my way).
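To spell out the failure mode we are both describing, here is a minimal sketch. It assumes a hypothetical helper `my/suffixed-references' that only mimics "title plus a numeric suffix on duplicates"; it is not the code from my patch.

```emacs-lisp
;; Sketch only: `my/suffixed-references' is a hypothetical stand-in for
;; "reference = title, with a numeric suffix on later duplicates".
(defun my/suffixed-references (titles)
  "Return one reference per element of TITLES, in document order.
The first occurrence of a title keeps the bare name; later
duplicates get a numeric suffix."
  (let ((seen (make-hash-table :test #'equal))
        refs)
    (dolist (title titles (nreverse refs))
      (let ((count (gethash title seen 0)))
        (puthash title (1+ count) seen)
        (push (if (zerop count)
                  (downcase title)
                (format "%s-%d" (downcase title) count))
              refs)))))

;; First export: a single "Foo" heading (the one containing "bar").
(my/suffixed-references '("Foo"))        ; => ("foo")
;; Later: a second "Foo" (containing "baz") is inserted above it, so
;; the original heading is now second in document order.
(my/suffixed-references '("Foo" "Foo"))  ; => ("foo" "foo-1")
```

After the edit, "foo" names the new "baz" heading and the original one has become "foo-1", so an existing link to "foo" still resolves, just to the wrong place. A random reference would simply have broken instead.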
> More importantly, the above is not limited to headlines with the exact
> same title. Since your algorithm truncates output, this will happen in
> various, less obvious, situations.
While this is technically possible, I think it's worth noting that I
have never seen this in practice, and for reference I have documents
with hundreds of headings (250 in my config, for example).
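Anyone who wants to check their own documents could do so with something like the following sketch. The helper name `my/colliding-references' is hypothetical, and the contraction function is passed in as an argument so that any scheme can be tested.

```emacs-lisp
;; Sketch only: `my/colliding-references' is a hypothetical helper.
(defun my/colliding-references (titles contract)
  "Return groups of TITLES that CONTRACT maps to the same string.
CONTRACT is a function from a title to its reference.  Each
element of the result has the form (REFERENCE TITLE...)."
  (let ((table (make-hash-table :test #'equal))
        collisions)
    ;; Group titles by the reference they would be given.
    (dolist (title titles)
      (push title (gethash (funcall contract title) table)))
    ;; Keep only references shared by more than one title.
    (maphash (lambda (ref group)
               (when (cdr group)
                 (push (cons ref (reverse group)) collisions)))
             table)
    collisions))

;; Stand-in contraction: keep the first 10 characters, lowercased.
(my/colliding-references
 '("Introduction to Org" "Introduction to Emacs" "Summary")
 (lambda (title)
   (downcase (substring title 0 (min 10 (length title))))))
;; => (("introducti" "Introduction to Org" "Introduction to Emacs"))
```

The titles of a real buffer could be collected with something along the lines of (org-map-entries (lambda () (org-get-heading t t t t))). Note that exact duplicate titles are reported as collisions too.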
>>> Also, header content is not stable enough: when you're linking to the
>>> custom ID, you may be able to change the title and yet preserve the
>>> link.
>>
>> Custom IDs still work, so I don't quite see the point here.
>
> How can you be sure?
>
> The point is that in some export back-ends, e.g., ASCII, you will only
> provide a single reference for a headline, i.e., not one for the title
> and another one for the custom ID. If your reference is based solely on
> the title, the reference will break whenever you modify the title
> without touching custom ID. I gave an example in an earlier post
> already. This is a regression wrt the current system.
I remain rather confused on this point. Say I have a document with the
following content:
* Some heading
:PROPERTIES:
:CUSTOM_ID: hey
:END:
See [[#hey]] or [[Some heading]]
In an HTML export I see:
<li><a href="#hey">1. Some heading</a></li>
[...] See <a href="#hey">1</a> or <a href="#hey">1</a></p>
In an ASCII export:
1 Some heading
══════════════
See 1 or 1
In a LaTeX export:
\section{Some heading}
\label{hey}
See \ref{hey} or \ref{hey}
etc.
I don't see how my code affects custom IDs.
> In a nutshell:
>
> - there are very interesting points in your proposal;
Glad you've found some things of interest.
> - it is not applicable at the moment;
I'm guessing this is solely due to punycode?
> - it greatly improves references for English language, it is slightly
> better for latin languages, and worse for non-latin ones;
>
> - it does not guarantee link stability during export;
Indeed. However, no approach short of caching every heading on every
export does, and I find that this /significantly/ improves stability.
> - it introduces a regression wrt custom ID.
See my confusion above.
> Link stability is still an issue, even if the proposal gives a false
> sense of security in that area. I don't think we can solve it without
> creating a cache for export, where you store all previous references for
> a given file. Even this is not sufficient, because you can export
> buffers not attached to files.
To me this is a case of "don't let the perfect be the enemy of the
good". I do see that a false sense of security may be problematic, but
I consider the benefits to outweigh this.
I hope you've found this reply more useful than my last,
Timothy.