emacs-orgmode

Re: [O] Citations, continued


From: Nicolas Goaziou
Subject: Re: [O] Citations, continued
Date: Fri, 06 Feb 2015 11:27:15 +0100

Richard Lawrence <address@hidden> writes:

Thanks for this reverse engineering.

> Specifically I think we need the following categories, all of which
> would be objects:
>   - key
>   - prefix / pre-text
>   - suffix / post-text
>   - locator
>   - individual citation
>   - bracketed citation
>   - unbracketed citation
>
> These should have a grammar like the following, based on my
> (reverse-engineered) understanding of the Pandoc syntax for citations:
>
>   - A bracketed citation is a list of one or more individual citations, 
>     separated by ';' if there are two or more, and surrounded by '[' ']'
>   - An individual citation is formatted like: PREFIX KEY LOCATOR SUFFIX
>     The key is obligatory, and the prefix, locator and suffix
>     are optional.
>   - A key optionally begins with '-', and obligatorily contains '@'
>     followed by a string of characters which begins with a letter or '_',
>     and may contain alphanumeric characters and the following internal
>     punctuation characters:
>        :.#$%&-+?<>~/
>   - A prefix or suffix is a text object (that may contain markup like
>     emphasis or macros)
>   - An unbracketed citation consists of a key, optionally followed by a
>     locator which is enclosed in '[' ']'

I don't think all should be objects. For example, prefix and suffix can
be properties in a `full-citation' object (like :tag in items).

IIUC, we need two objects (I'm not wedded to the names):

  - short-citation (aka unbracketed citation), with :cite-key
    and :locator properties, both strings, and :suppress-author,
    a boolean;

  - full-citation (aka individual citation), with, in addition to the
    properties above, :prefix and :suffix, both parsed strings.

Since full citations can only exist in a bracketed citation, there is no
reason to create a third object type for the latter. It acts as a mere
container, useful only to the lexer.
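
To make the two object types concrete, here is a minimal sketch in
Python (not Org's actual parser, which would be written in Emacs
Lisp). The class names, the field names, and the locator regexp are
assumptions made for illustration. The key regexp follows the grammar
quoted above, and additionally assumes that "internal" punctuation must
be followed by a word character, so that a key never ends with
punctuation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Key grammar from the quoted proposal: optional "-", then "@", then a
# letter or "_", then alphanumerics or internal punctuation.
# Assumption: punctuation only counts when a word character follows it.
KEY_RE = re.compile(
    r"(?P<suppress>-)?@"
    r"(?P<key>[A-Za-z_](?:[A-Za-z0-9_]|[:.#$%&\-+?<>~/](?=[A-Za-z0-9_]))*)"
)

# Assumption: a locator is plain text in "[...]", optionally preceded
# by whitespace. Its real syntax is still undecided in this thread.
LOCATOR_RE = re.compile(r"\s*\[([^\]]*)\]")


@dataclass
class ShortCitation:
    cite_key: str
    locator: Optional[str] = None
    suppress_author: bool = False


@dataclass
class FullCitation(ShortCitation):
    # Parsed strings in the proposal; kept as raw text in this sketch.
    prefix: Optional[str] = None
    suffix: Optional[str] = None


def parse_short_citation(text):
    """Parse an unbracketed citation at the start of TEXT, or return None."""
    m = KEY_RE.match(text)
    if m is None:
        return None
    loc = LOCATOR_RE.match(text, m.end())
    return ShortCitation(
        cite_key=m.group("key"),
        locator=loc.group(1) if loc else None,
        suppress_author=m.group("suppress") is not None,
    )
```

Under these assumptions, parse_short_citation("-@doe:2015 [p. 3]")
gives a short citation with key "doe:2015", locator "p. 3", and
suppress_author set. A bracketed citation would then simply be a list
of FullCitation values, with no object type of its own.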

> I am not sure about the syntax of locators.  In particular, I do not
> know if they should allow internal markup, I do not know if they have an
> internal syntax, and I do not know if a comma is required to separate
> them from a key in a bracketed citation.

This needs to be decided indeed. Is there any reason to allow markup
there?

My only concern is speed. A bracketed citation can induce a lot of
backtracking since it can be triggered each time a square bracket is
opened, which is not too uncommon, I think. Basically, at each "[", we
need to find the corresponding "]" and, if there is one, look for a key
between the two. That's some overhead.
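
A rough sketch of that scan, in Python, to show where the cost comes
from. This is an illustration only, not how Org's parser works; the
function names are made up, the key test only checks the start of a
key, and counting nested brackets when looking for the closing "]" is
an assumption.

```python
import re

# Only the start of a key: optional "-", "@", then a letter or "_".
KEY_START = re.compile(r"-?@[A-Za-z_]")


def find_closing(text, start):
    """Return the index of the "]" matching the "[" at START, or -1."""
    depth = 0
    for i in range(start, len(text)):
        if text[i] == "[":
            depth += 1
        elif text[i] == "]":
            depth -= 1
            if depth == 0:
                return i
    return -1


def bracketed_candidates(text):
    """Return (spans, attempts).

    SPANS lists (start, end) pairs that could be bracketed citations.
    ATTEMPTS counts how many "[" had to be examined to find them.
    """
    spans = []
    attempts = 0
    pos = text.find("[")
    while pos != -1:
        attempts += 1
        end = find_closing(text, pos)
        # Each attempt scans forward for "]" and then searches for a key.
        if end != -1 and KEY_START.search(text, pos + 1, end):
            spans.append((pos, end + 1))
        pos = text.find("[", pos + 1)
    return spans, attempts
```

On a line such as "See [fn:1], [2015-02-06] and [see @doe p. 3].",
this examines three opening brackets to find a single candidate. Every
footnote, timestamp, and link in a buffer pays the same price.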

Also, the syntax is ambiguous. For example, in

  [[http://orgmode.org][some @key]]

it is not clear if @key should be treated as a short-citation in a link
description, or included in a full citation with
"[http://orgmode.org][some " as its prefix. I mean, the answer is clear
for you and me, but not necessarily at the lexer's level. For example,
Eric's parser chose the former, which is good, but it also disallows
square brackets in the prefix, which rules out some objects from this
location (mainly links and footnotes).

That's why I suggested the [cite: ...] part in the first place, which
you dismissed quickly. It reduces backtracking a lot and can easily
resolve some confusing situations.

Of course I understand the need for compatibility with existing Pandoc
syntax, but I wouldn't want us to shoot ourselves in the foot. Even if
we don't use "cite:" markup, I think we should carefully specify the
current syntax to avoid loopholes.


Regards,

-- 
Nicolas Goaziou


