emacs-orgmode


From: Max Nikulin
Subject: Re: [HELP] Fwd: Org format as a new standard source format for GNU manuals
Date: Tue, 4 Oct 2022 23:32:47 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0

It seems I completely failed to express my idea.

Instead of extending the Org grammar (syntax), I suggest changing the behavior of source blocks during export. In addition to the current :results options, "ast" may be added. Its effect is that, instead of adding text to the export buffer to be parsed as Org markup, the block inserts a branch of the syntax tree into the original parse results. I admit that during export it may be necessary to iterate over source blocks one more time at a later stage.
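
To make the difference between the two modes concrete, here is a rough sketch in Python (not Org code) of such an extra pass. The tuple-based tree and the names splice_results and evaluate are hypothetical and only stand in for the parse tree and a babel backend; nothing here reflects how org-export-as is actually implemented.

```python
# Hypothetical model: a node is (type, props, children); leaves are strings.
def splice_results(node, evaluate):
    """Return a copy of the tree in which every src-block node marked
    with results == "ast" is replaced by the branch `evaluate` returns."""
    if isinstance(node, str):
        return node
    node_type, props, children = node
    if node_type == "src-block" and props.get("results") == "ast":
        # The block body goes to the evaluator; its result is inserted
        # as a tree branch and is never re-parsed as Org markup.
        return evaluate(props.get("language"), children[0])
    return (node_type, props, [splice_results(c, evaluate) for c in children])


def evaluate(language, body):
    # Stand-in for a babel backend: wraps the body in a fixed branch.
    return ("code", {"class": "var"}, [body])


tree = ("paragraph", {}, [
    "The library for ",
    ("src-block", {"language": "alt", "results": "ast"}, ["language"]),
    " is loaded.",
])
result = splice_results(tree, evaluate)
```

The point of the sketch is only that the replacement happens on the parsed tree, so the returned branch does not have to be expressible in Org markup.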

Such source blocks should return an "Org Syntax Tree", a simplified variant of org-element. Keeping it separate makes it possible to change implementation details of org-element, e.g. to use vectors instead of lists for attributes. A converter from Org Syntax Tree to org-element should be implemented.

Certainly, such a format may be used directly, either as src_ost{(code (:class var) "language")} inline snippets or as

#+begin_src ost
  (code nil ("libtree-{sitter}-"
             (code (:class var) "\"language\"")
             "."
             (code (:class var) "ext")))
#+end_src

block-level elements. However, I expect it to be a last-resort option for cases when there is no other way to express the desired construct.
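
As an illustration of what an export backend could do with such a tree, here is a small Python sketch (not Org code) that renders the example above as Texinfo. The tuple representation, the function names, and the rule that a :class attribute selects the Texinfo command are all assumptions made for this sketch.

```python
def texinfo_escape(text):
    # Texinfo treats @, { and } as special; each is prefixed with @.
    return "".join("@" + ch if ch in "@{}" else ch for ch in text)


def to_texinfo(node):
    """Render a (type, attrs, children) node, or a plain string."""
    if isinstance(node, str):
        return texinfo_escape(node)
    node_type, attrs, children = node
    # Assumed mapping: the class attribute, when present, names the
    # Texinfo command; otherwise the node type is used (code -> @code).
    command = (attrs or {}).get("class", node_type)
    return "@%s{%s}" % (command, "".join(to_texinfo(c) for c in children))


# The same tree as in the ost block above, written as Python tuples.
tree = ("code", None, [
    "libtree-{sitter}-",
    ("code", {"class": "var"}, ['"language"']),
    ".",
    ("code", {"class": "var"}, ["ext"]),
])
rendered = to_texinfo(tree)
# rendered == '@code{libtree-@{sitter@}-@var{"language"}.@var{ext}}'
```

Braces in plain text need escaping only at this final step, so the author of the tree does not have to care about them.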

I think more convenient org-babel backends may be created to parse TeX-like (texinfo-like) or SGML-like (XML-like) syntax into an Org Syntax Tree hierarchy. The essential idea is that the usual lightweight markup is used outside of source blocks. Source blocks, however, have just a few special characters ([\{}], [@{}], or [&<>], etc.) to reduce escaping issues for regular text and verbatim-like commands.
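
To show how little such a backend needs, here is a Python sketch (not Org code) of a parser for a TeX-like notation with only \, { and } as special characters. The notation \name[key=value]{contents}, the function name parse_alt, and the tuple output are hypothetical; literal braces in text are assumed to be written as \{ and \}.

```python
import re

# \name, an optional [key=value,...] part, then the opening brace.
COMMAND = re.compile(r"\\([A-Za-z]+)(?:\[([^\]]*)\])?\{")


def parse_alt(text, pos=0, inside=False):
    """Parse TeX-like markup; return (children, next position)."""
    children, buf = [], []

    def flush():
        if buf:
            children.append("".join(buf))
            buf.clear()

    while pos < len(text):
        ch = text[pos]
        if ch == "\\" and text[pos + 1:pos + 2] in ("\\", "{", "}"):
            buf.append(text[pos + 1])  # escaped special character
            pos += 2
        elif ch == "\\" and (m := COMMAND.match(text, pos)):
            flush()
            attrs = None
            if m.group(2):
                pairs = (item.split("=", 1) for item in m.group(2).split(","))
                attrs = {key.strip(): value.strip() for key, value in pairs}
            inner, pos = parse_alt(text, m.end(), inside=True)
            children.append((m.group(1), attrs, inner))
        elif ch == "}" and inside:
            flush()
            return children, pos + 1  # end of the current command
        else:
            buf.append(ch)
            pos += 1
    if inside:
        raise ValueError("unbalanced braces")
    flush()
    return children, pos


source = r'\code{libtree-\{sitter\}-\code[class=var]{"language"}.\code[class=var]{ext}}'
parsed, _ = parse_alt(source)
```

For this input, parsed holds a single code node whose children are the string "libtree-{sitter}-", a code node with class var, the string ".", and another code node with class var, i.e. the same hierarchy as in the ost block above.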

Some comments are inline.

On 03/10/2022 11:36, Ihor Radchenko wrote:
Max Nikulin writes:

On 02/10/2022 11:59, Ihor Radchenko wrote:

If you are asking how to represent such construct without introducing
custom elements then (it may be e.g. :type, not :class) parsed AST
should be like

      (code nil ("libtree-{sitter}-"
                 (code (:class var) "\"language\"")
                 "."
                 (code (:class var) "ext")))

This is not much different from @name[nil]{<contents>} idea, but
more verbose.

> Also, more importantly, I strongly dislike the need to wrap the text
> into "". You will have to escape \". And it will force third-party
> parsers to re-implement Elisp sexp reader.

With this example I was trying to show how to express @var, @samp, and @file without introducing new custom objects. I do not see any problem with the verbosity of such a format: it may be reserved for really special cases, while some more convenient markup is used for simpler ones.

If there was some syntax for object attributes then simple cases would
be like

      [[attr:(:class var)]]~language~

I do not like this idea. It will require non-trivial changes in Org
parser and fontification.

Using dedicated object properties or at least inheriting properties from
:parent is the style we employ more commonly across the code:

@var{language}
or
@code[:class var]{language}
or
@attr[:class var]{~language~}

I do not mind having some "span" object that assigns attributes to its direct children. I used a link-like prefix object only because a proof of concept may be tried with no changes in Org; it does not require support for nested objects. There is no existing syntax for such "span" objects, but perhaps it is not necessary and source blocks should be used instead for special needs.

I have no idea concerning particular markup that can be used inside
source blocks. It might be LaTeX-like commands as discussed in the
sibling subthread or HTML (XML) based syntax that is more verbose than
TeX-like notation.

      By convention, the dynamic library
      for src_alt{\code[class=var]{language}} is
src_alt{\code{libtree-\{sitter\}-\code[class=var]{"language"}.\code[class=var]{ext}}},
      where src_alt{\code[class=var]{ext}} is the
      system-specific extension for dynamic libraries.

I am against the idea of LaTeX-like commands. It will clash with
latex-fragment object type.
https://orgmode.org/worg/dev/org-syntax.html#LaTeX_Fragments

or

      By convention, the dynamic library for
      src_alt{<code class="var">language</code>} is
      src_alt{<code>libtree-{sitter}-<code
class="var">"language"</code>.<code class="var">ext</code></code>},
      where src_alt{<code class="var">ext</code>} is the
      system-specific extension for dynamic libraries.

This style will indeed make things easier for the parser. But I find it
too verbose for practical usage. This is why I instead proposed the idea
with variable number of brackets: @code{{can have } inside}}.

Texinfo is TeX with \ replaced by @: a different character is assigned the category that starts a command. The important point is that, while Org markup uses a lot of special characters (*/_+[]...), this flexible markup should use just a few. I do not see any obstacles to trying texinfo-like markup. Source blocks allow several languages to coexist.

A hypothetical "alt" babel language has default :results ast :exports
results header arguments to inject AST, bypassing the Org markup stage.

The problem with src block emitting AST is clashing with the way src
blocks work during export. What `org-export-as' does is replacing/adding
src block output into the actual Org buffer text before the parsing is
done.

Handling direct AST sexps will require a rewrite on how babel
integration with export works.

Yes, it will. I am evaluating the feasibility of such a change as an alternative to extending Org syntax with custom elements.
