
From: Dmitrii Korobeinikov
Subject: bug#35419: [Proposal] Buffer Lenses and the Case of Org-Mode (also, Jupyter)
Date: Thu, 25 Apr 2019 00:35:16 +0600

For your convenience, I have attached this e-mail as an org-mode file.

*SUMMARY* A /buffer lens/ is an object inside a buffer which supplies that buffer with lines to display and which is linked to another buffer (from which it can take and process data), while mapping user input back to that linked buffer. A /composite lens/ extends this idea to interface with any source of data for which a text interface is possible. Several Org-mode problems are addressed, particularly those concerning source block editing and viewing, syntax checking, completion and reference expansion.

This is a proposal for /lenses/. Hereby, I outline the general idea, the problems it solves, the features it introduces and its use cases.

* Problem Statement

  The problem is that it's hard to treat some area inside a buffer
  - as if it ran in a different buffer, with its own set of modes,
  - as an object w/ distinct properties and interface.

  This proposal is not drawn out of thin air: the bigger part of it is about applications
  (for an immediate example, you could think of an org mode source block.)

* Mechanics

  The idea is to embed a buffer inside another buffer.
  Have a block of text, a sentence, a table, a word -- some /area/ in your buffer -- behave as if it were in some other distinct buffer, which has its own modes and keybindings.

  What's proposed here is to do such a trick using a /buffer lens/.

  First, let's form a preliminary, prototype definition (it will be changed later in this section).
  A /buffer lens/ is an object which views its /linked buffer/; that buffer is, in turn, displayed through /areas/.
  An /area/ is a text object, whose properties and contents are controlled by its /lens/.
  The /linked buffer/ is (conceptually) just a regular buffer, whose contents the lens can use to control its /area/.
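
  The prototype definition can be made concrete with a small model. This is only a sketch, written in Python rather than Emacs Lisp; the class names =LinkedBuffer=, =BufferLens= and =Area= and their methods are hypothetical and do not correspond to any existing Emacs API.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedBuffer:
    """A regular buffer: some text plus the modes that run in it."""
    text: str
    modes: list = field(default_factory=list)


@dataclass
class BufferLens:
    """Views one linked buffer and controls the areas that display it."""
    linked: LinkedBuffer
    areas: list = field(default_factory=list)

    def make_area(self, **properties):
        area = Area(lens=self, properties=properties)
        self.areas.append(area)
        return area


@dataclass
class Area:
    """A text object in the working buffer; its contents come from the lens."""
    lens: BufferLens
    properties: dict = field(default_factory=dict)

    def lines(self):
        # The area owns no text; it asks its lens for the lines to display.
        return self.lens.linked.text.splitlines()


table = LinkedBuffer("| 0 | A |\n| B | 0 |", modes=["org-mode"])
lens = BufferLens(table)
area = lens.make_area(align="center")
print(area.lines())
```

  The point of the sketch is the direction of ownership: the area holds properties only, and everything it shows is fetched from the linked buffer through the lens.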

  Suppose you have opened some buffer - call it a /Working buffer/ or /WB/.
  /Lens-mode/ is the mode responsible for all the /lenses/ inside the /working buffer/.

  The goals of the /lens-mode/ are:
  - Identify, track and handle all the /lenses/ in the buffer.
  - Display each /area/ according to the options of its /lens/ and its own options.
  - Let the user relay the control/input to a /lens/ (say, when the cursor is in its /area/), which would in turn relay the input to its /linked buffer/ as if it were in its own window. Either all user actions could be relayed or just a defined subset (e.g. specific keybindings or commands).
  - Let the user define specific maps and user input handling routines for any specific /lens/ or for any type of /lens/, where type is deduced from the properties of the lens (by a user-defined function).
  - Propagate save commands to the lenses
    (which may signal the /model/ to adopt the changes made in the /shared base/; both terms are discussed shortly).
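
  The tracking and relaying goals could look roughly like the following. It is a sketch under assumptions: =Lens=, =LensMode=, =register= and =dispatch= are made-up names, positions are plain integers, and a keymap is reduced to a dictionary from key to command name.

```python
class Lens:
    def __init__(self, name, keymap):
        self.name = name
        self.keymap = keymap          # key -> command name in the linked buffer


class LensMode:
    """Tracks lenses by the region their area occupies in the working buffer."""

    def __init__(self, fallback_keymap):
        self.fallback_keymap = fallback_keymap
        self.regions = []             # (start, end, lens), end exclusive

    def register(self, start, end, lens):
        self.regions.append((start, end, lens))

    def lens_at(self, point):
        for start, end, lens in self.regions:
            if start <= point < end:
                return lens
        return None

    def dispatch(self, point, key):
        """Return (target, command) for a key pressed at point."""
        lens = self.lens_at(point)
        if lens is not None and key in lens.keymap:
            return lens.name, lens.keymap[key]
        # Keys the lens does not claim stay with the working buffer.
        return "working-buffer", self.fallback_keymap.get(key)


mode = LensMode({"M-;": "org-comment-dwim", "C-x C-s": "save-buffer"})
mode.register(120, 180, Lens("src-block", {"M-;": "comment-dwim"}))

print(mode.dispatch(150, "M-;"))      # inside the area: relayed to the lens
print(mode.dispatch(10, "M-;"))       # outside: handled by the working buffer
print(mode.dispatch(150, "C-x C-s"))  # inside, but not claimed by the lens
```

  Relaying only a defined subset of keys falls out of the lookup order: whatever the lens keymap does not contain is handled by the working buffer as usual.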

  We could also come up with a general description of a /lens/.
  Call it a /composite lens/.
  
  A /composite lens/ could consist of these parts: /model/, /representation/, /linked buffer/, /shared base/, /shared block/, /view/ and /controller/.
  This is almost like MVC.
  I find it somewhat more intuitive to use the term /area/ instead of /view/, so let's do that (at least for now).

  /Model/ is data (strings, buffers, databases, anything).
  /Representation/ is the description of how to build /shared blocks/ and /shared bases/ using the /model/.
  /Shared base/ consists of /shared blocks/.
  Each /shared block/ is constructed (through representation) to be:
  - either plain text or
  - a lens (such as /buffer-lens/) (hence the name /composite-lens/ - it makes use of other lenses).
  /Linked buffer/ is a buffer which views a /shared base/. This buffer is like a regular buffer (runs its own modes, etc.).
  /Area/ is associated w/ one /linked buffer/ and is its contents + some properties.
  /Controller/ maps the user input to the /linked buffer/ of the /area/. 

  Get a load of this:
  The /linked buffer/ views a /shared base/, which consists of /shared blocks/, which are either plain text or other lenses, being described by the /representation/ of the /model/, which may be ultimately accessed via the /controller/, which maps the user input from the /area/.

  Plain text for /shared blocks/ is for stuff that doesn't need to be kept in the /model/ (say, for delimiters which identify the /lens/ as such in the /working buffer/, where they reside before the /lens-mode/ runs).

  The goal of these abstractions is to allow having
  - multiple /areas/, 
  - each having a dedicated buffer (w/ distinct modes and properties),
    - which share the same textual data (the /shared base/), 
      - which is compiled from plain text or other lenses (/shared blocks/),
        - which have their own input interface (e.g. /buffer-lenses/).

  Since the data is shared between the /areas/ (through /shared base/), if the user makes some changes in one /area/, the changes immediately appear in all other /areas/.
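
  A minimal sketch of this sharing, assuming a simple callback scheme (the names =SharedBase=, =Area=, =replace= and =refresh= are illustrative only): two areas with different visual properties render the same shared text, and an edit reaches both.

```python
class SharedBase:
    """Text shared by several linked buffers; notifies them on change."""

    def __init__(self, text):
        self.text = text
        self.listeners = []

    def replace(self, old, new):
        self.text = self.text.replace(old, new)
        for callback in self.listeners:
            callback()


class Area:
    """Displays the shared base with its own visual properties."""

    def __init__(self, base, indent=0):
        self.base = base
        self.indent = indent
        self.rendered = []
        base.listeners.append(self.refresh)
        self.refresh()

    def refresh(self):
        pad = " " * self.indent
        self.rendered = [pad + line for line in self.base.text.splitlines()]


base = SharedBase("f = lambda x: x ** x")
in_block = Area(base, indent=2)
in_edit_buffer = Area(base)

base.replace("x ** x", "x * x")   # an edit made through either area
print(in_block.rendered)
print(in_edit_buffer.rendered)
```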

  As such, a /buffer lens/ is a special case of a /composite lens/, except 
  - it has no /model/ (and so, needs no /representation/), 
  - its /shared base/ is a single buffer.
  Note, this approach differs from what was described in the beginning of this section: now there can be multiple /linked buffers/, all sharing one /shared base/.
  This approach is more powerful: it allows multiple /areas/ to behave differently and reduces data duplication.

  One point worth attention is that an /area/ should also have its own set of properties, such as custom padding, alignment (center, left, right), etc.
  Such properties could be arbitrary as long as mapping the user input back to the /linked buffer/ can be performed.
  This will allow visual customizations.
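
  To illustrate what "mapping the user input back" means for such properties, here is a sketch that centers and pads one line and then translates a column in the area into a column in the linked buffer. The functions =render_line= and =to_buffer_column= and their parameters are assumptions made for the example.

```python
def render_line(text, width, align="left", padding=0):
    """Return (displayed line, offset of the text within it)."""
    inner = width - 2 * padding
    if align == "center":
        shift = max((inner - len(text)) // 2, 0)
    elif align == "right":
        shift = max(inner - len(text), 0)
    else:
        shift = 0
    offset = padding + shift
    return " " * offset + text, offset


def to_buffer_column(display_column, offset, text):
    """Map a column in the area back to a column in the linked buffer."""
    return min(max(display_column - offset, 0), len(text))


row = "| 0 | A | 0 |"
line, offset = render_line(row, width=33, align="center", padding=2)
column = to_buffer_column(offset + 6, offset, row)
print(repr(line))
print(offset, column, row[column])
```

  As long as the offset introduced by a property can be computed, as it can for padding and alignment, the cursor position in the area determines a unique position in the linked buffer.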

  /Representations/ should be modifiable (through some interface).

  The usage of /shared blocks/ above is really for explanatory purposes and, possibly, less of a necessity (but may indeed come in useful when there are multiple representations).

  Saving is also an important thing to discuss.
  The lens should be able to form an area, where the displayed text /shadows/ the text which actually needs to be saved.
  This is doable: in org-mode, when you fold a heading, the ellipsis has arbitrary text hidden in its place.
  The possibilities are:
  - use faces/folding or such (I am not too familiar w/ the associated technicalities, but I think they could work), or
  - (what seems like the better solution), the save operation could grab the /shared base/ which the /area/ uses (or query the lens for what to save).
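
  The second option can be sketched as follows, using the folded heading mentioned above as the example. Each piece of an area carries the text it shows and, optionally, the text it shadows; the names =Block=, =display= and =save= are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    shown: str                    # what the area displays
    saved: Optional[str] = None   # what is written on save; None = same as shown

    def text_to_save(self):
        return self.shown if self.saved is None else self.saved


def display(blocks):
    return "\n".join(block.shown for block in blocks)


def save(blocks):
    return "\n".join(block.text_to_save() for block in blocks)


area = [
    Block("* Heading"),
    Block("...", saved="Body text hidden behind the ellipsis."),
    Block("* Next heading"),
]
print(display(area))
print(save(area))
```

  The save operation never inspects the displayed text; it asks each block what to write, which is the "query the lens" variant.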

  And as for the undo behavior, what's clear is that the changes will need to be tracked to the /shared base/ of all /areas/ of the same /representation/.

* Practical Applications

** Org-mode

  Org-mode, a distinct planescape in the Emacs multiverse, might be the mode to benefit the most from the use of lenses.

*** 3D Tables

    Suppose you have three tables:

    | 0 | 0 | 0 |
    | 0 | A | 0 |
    | 0 | 0 | 0 |

    | 0 | B | 0 |
    | B | 0 | B |
    | 0 | B | 0 |

    | C | 0 | C |
    | 0 | 0 | 0 |
    | C | 0 | C |

    Say, these tables are just the layers of a 3x3x3 cube.
    You might want to view this cube as a distinct entity:
    
    -**LAYER 1**-
    | 0 | 0 | 0 |
    | 0 | A | 0 |
    | 0 | 0 | 0 |

    You could use a composite lens for this.
    The /model/ would be three buffers, one table in each.
    The /representation/ describes the /view/ as:
    - a string ("-**LAYER 1**-") and
    - a /buffer lens/ linking one of the three buffers in the /model/.

    One could command the lens to switch to the next layer: just change the /representation/.
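
    A sketch of this cube lens, with the representation reduced to a single layer index. =CubeLens= and its methods are invented for the illustration; the three tables are the ones shown above.

```python
LAYERS = [
    ["| 0 | 0 | 0 |", "| 0 | A | 0 |", "| 0 | 0 | 0 |"],
    ["| 0 | B | 0 |", "| B | 0 | B |", "| 0 | B | 0 |"],
    ["| C | 0 | C |", "| 0 | 0 | 0 |", "| C | 0 | C |"],
]


class CubeLens:
    def __init__(self, model):
        self.model = model      # three tables, one per layer
        self.layer = 0          # the representation: which layer to show

    def next_layer(self):
        self.layer = (self.layer + 1) % len(self.model)

    def view(self):
        header = f"-**LAYER {self.layer + 1}**-"
        return [header] + self.model[self.layer]

    def text_to_save(self):
        # Saving writes every layer, not just the visible one.
        return "\n\n".join("\n".join(table) for table in self.model)


cube = CubeLens(LAYERS)
print("\n".join(cube.view()))
cube.next_layer()
print("\n".join(cube.view()))
```

    Switching layers touches only the representation; the model stays as it is, so the text written on save always contains all three tables.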

    When the cursor is in the /area/, the user input now is handled in its /linked buffer/.
    When the cursor is directly on the table, the input is relayed to the /buffer lens/ of the table, whose /area's/ /linked buffer/ has org-mode running.

    The tables are identified and replaced when the /lens-mode/ runs for the first time.
    But what should happen when the user saves the buffer?
    We want to save all three tables, not just the one which happens to be currently viewed.
    The possible solution, as discussed in [[Mechanics]], could be to /shadow/ the three tables.

    Of course, tables can be explored in any way, not just layer by layer.
    https://en.wikipedia.org/wiki/Online_analytical_processing

    So, to give a perspective on all this: a table becomes an object and the modes that run in the /linked buffer/ form the interface to this object.
    And, really, a table is a table for example purposes, and, obviously, anything else could be in its place.

*** Code Blocks and Results: Editing and Viewing

    Here I will first discuss all the relevant problems and then propose a solution.

**** Problems
***** Editing Source Blocks [0/5]

     Consider this block:

     #+BEGIN_SRC elisp
       (defun step (x l)
         (cond ((< x l) 0)
               (t 1)))
     #+END_SRC

     First, 'newline-and-indent doesn't indent the code correctly.
     It simply seems to look at the first symbol of the previous line:

     - [ ] support for mode-specific editing features is limited,
     - [ ] the mode-specific interactive features (say, keybindings) are disregarded.

     Using 'org-edit-src-code helps, but it is a hassle.

     What's more, every time 'org-edit-src-code is used:

     - [ ] a new buffer is created, which implies initialization of the buffer modes,
     - [ ] the changes need to be written back.

     The issues with the points above can be demonstrated by observing the bugs w/ 'org-comment-dwim:

     - the screen is scrolled to show the first line of the block,
     - in visual line mode, the cursor jumps to the beginning of the block.

     Marks are thrown back to the beginning of the block -- for the same reason.

     Also, the larger the code block, the more work has to be done, so,

     - [ ] some lag is to be expected.

     (Bug report for the commenting behavior: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=bug%2334977)

***** Viewing Source Blocks and Results [0/6]

     Consider these two blocks:

     #+NAME: syntax-tangling-commenting
     #+BEGIN_SRC elisp
       (defun step (x l)
         (cond ((< x l) 0)
               (t 1)))
     #+END_SRC

     #+BEGIN_SRC elisp :noweb yes
       <<syntax-tangling-commenting>>
       (step "wrong argument")
     #+END_SRC

     First and foremost, there is no syntax checking and neither is there completion:

     - [ ] what can be easily done in a separate buffer w/ some mode on can't be done inline.

     And using 'org-edit-src-code is not much of a relief as the syntax checker and the completer have 

     - [ ] no way of dealing with references to process code as if tangled in.

     The consequences of this point reach farther than those of the missing syntax checking or completion:
     sometimes, looking at code as a coherent whole, and not a series of disconnected snippets, is what the programmer needs.

     In principle, it could be possible for 'org-edit-src-code to substitute code for each reference, but this is problematic due to the points made in the [[Editing Source Blocks]] section.

     Next, when some failed assertion or a runtime error tells you a line number, looking for it is tough,
     because there is no way to show line numbers (absolute, as if references were expanded).
     Or suppose a source block produces a huge table.
     This means you will want to truncate lines.
     But maybe that's not what you want for the rest of your buffer.
     These boil down to:

     - [ ] can't view a specific area of a buffer, like :results or a source block, as if it were in another buffer.

     One other nice feature to see would be running the code and seeing the results from within 'org-edit-src-code.
     Currently, if one works using 'org-edit-src-code, running the code is not an option, because you wouldn't see the :RESULTS anyway.
     Switching from the edit buffer to the main buffer and back is necessary.
     This problem could be solved by redirecting the output to some buffer and split screening, but, then again, updating the original buffer is required.

     - [ ] If the results of execution were to be redirected to a separate buffer, they would have to be mapped back to the original file for consistency.

     Let's now discuss the visual properties of code blocks and :results.

     #+NAME: one-liner
     #+BEGIN_SRC elisp
       (defun step (x l) (cond ((< x l) 0) (t 1)))
     #+END_SRC

     For two lines of meaningful data, there are four lines of, admittedly, noise.
     Meaningful: the stuff that the programmer concentrates on. 
     For longer snippets, the noise-to-signal ratio drops, but when you have a lot of snippets, these extra lines of parameters add up.
     And everything starts looking like spiders making love in a bowl of spaghetti.

     I think one way which could help is to have a summary line and hide the rest:

     > [one-liner] (elisp) <other stuff>
     >  (defun step (x l) (cond ((< x l) 0) (t 1)))

     And when it is necessary to edit the description of the block, one could expand it on a key combination:

     - the appearance of blocks could be more distinct and less noisy.

     To add to this, note how the code in the source blocks is indented two spaces forward.
     Those are hard spaces and they weren't inserted right away, only after using 'org-edit-src-code.
     No doubt, this indentation helps.
     However, it would be better for it to be purely visual and to be there without having to call anything.

     In short:

     - [ ] a powerful way to customize the view of blocks is needed.

     Next, the /ob-async/ package shows that asynchronous execution of blocks is possible.
     Apparently, the way it works is by placing a unique identifier in the RESULTS and then replacing it.
     Another possible scenario: display the current running time in the footer of a block.
     Is the find/search procedure the best one could do?

     - [ ] A unified way to update the view of a block / results section could help.

**** Solution
***** Basis

    A source block can be shown via a /composite lens/.
    And so can be every noweb reference.

    For instance, suppose a buffer (call it /main buffer/) has these blocks (also used in the following subsections):

    #+NAME: block-A
    #+BEGIN_SRC python
       f = lambda x: x ** x
    #+END_SRC

    #+NAME: block-B
    #+BEGIN_SRC python :noweb yes
       <<block-A>>
       g = lambda x: f(x) + f(x + 1)
    #+END_SRC

    So, these lenses are created:
    - /block-A/ /composite lens/ containing:
       - /buffer-A/ /lens/ (w/ contents "f = lambda x: x ** x")
       - begin/end source markers as plain text
    - /block-B/ /composite lens/ containing:
       - /buffer-A/ /lens/ (w/ contents "f = lambda x: x ** x" shadowing "<<block-A>>" or just "<<block-A>>")
       - /buffer-B/ /buffer lens/ (w/ contents "g = lambda x: f(x) + f(x + 1)")

    As you see, the code of block-A appears in two blocks: as code and as a reference.
    A reasonable thing to do is create just one lens for both.
    So, this lens should contain the code, but should also be able to show just the reference.
    There are multiple ways to display this lens (that is, to form an /area/):
    show the code/reference which, /optionally/, shadows reference/code.
    For example, in block-B, we may choose to show code and shadow the reference, so that when the document is saved, the text of the reference is actually saved and not the code.

    So, in the end, there is just one lens for /buffer-A/, with two /areas/, one for /block-B/ and one for /block-A/.
    The lens could be either /buffer-lens/ or a /composite-lens/, it's up to the implementation.

***** Editing

    The most important point to remember now is this: 
    the linked buffer has its own set of modes, independent of the modes in the working buffer.

    Lens-mode offers an important utility (discussed in [[Mechanics]]): 
    it can redirect keybindings and commands when the cursor is in the area of the lens.
    So, basically, the user can have access to all the features of the modes which run in the linked buffer.
    This immediately implies proper indentation and the resolution of the bug w/ commenting.
    (The comment keybinding is forwarded to the /linked buffer/ and the right thing is done there.)

    And what about 'org-edit-src-code?
    Just open a buffer and show the /buffer-A/B/ lens there!
    No need to write the changes back: all the areas of the lens update in real time.

***** Viewing

    In addition to the two blocks from [[Basis]], let's define:

    #+NAME: block-C
    #+BEGIN_SRC python :noweb yes
       <<block-B>>
       h = 10 * f(5) * g(2)
    #+END_SRC

    #+NAME: block-D
    #+BEGIN_SRC python :noweb yes
       <<block-B>>
       y = g(1)
    #+END_SRC

    So, the dependency tree is this:
    block-A <-- block-B <-- block-C
                        <-- block-D
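
    Such a tree can be derived from the block bodies themselves. The sketch below assumes the blocks are already available as a name-to-body dictionary (how they are extracted from the Org buffer is left out) and uses a simplified pattern for =<<name>>= references; =referenced_by= and =end_nodes= are illustrative names.

```python
import re

BLOCKS = {
    "block-A": "f = lambda x: x ** x",
    "block-B": "<<block-A>>\ng = lambda x: f(x) + f(x + 1)",
    "block-C": "<<block-B>>\nh = 10 * f(5) * g(2)",
    "block-D": "<<block-B>>\ny = g(1)",
}

NOWEB_REF = re.compile(r"<<([^<>\n]+)>>")


def referenced_by(blocks):
    """Map each block name to the names of the blocks that include it."""
    users = {name: [] for name in blocks}
    for name, body in blocks.items():
        for target in NOWEB_REF.findall(body):
            users.setdefault(target, []).append(name)
    return users


def end_nodes(blocks):
    """Blocks that nobody includes: the right-hand ends of the tree above."""
    users = referenced_by(blocks)
    return [name for name in blocks if not users[name]]


print(referenced_by(BLOCKS))
print(end_nodes(BLOCKS))
```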

    First, let's see what can be done about syntax checking, completion and the possibility of viewing all the code at once.
    How can we deal with the noweb references?

    Understanding what it is exactly that we want may help us answer the question.
    A rough list of requirements/wishes could be:

    (regardless of working in the /main buffer/ or using 'org-edit-src-code)
    - (1) have an option to expand the references into code
    - (2) have syntax checking and completion:
       - (a) which work as if the references were expanded (regardless of the actual reference expansion options),
       - (b) which work as if the code that references the block is seen as well 
         - if included by multiple blocks (e.g. in case of ~block-B~, the usage by ~block-C~ and ~block-D~), prefer the one which ran last.
       - (c) and minimize the work done.

    OK, all of the above can exploit the functionality of lenses, which, upon request, can produce the right /area/ based on the preferences of the buffer which contains the /lens/.
    (1) is covered: the lenses for references are recursively asked to show code instead of the reference.
    (2) is also covered - let's discuss the details.
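
    Point (1) amounts to a recursive walk over the references. A minimal sketch, assuming that a reference occupies a whole line (real noweb references may also appear inline) and that blocks are given as a dictionary; =render= is a hypothetical function name.

```python
import re

BLOCKS = {
    "block-A": "f = lambda x: x ** x",
    "block-B": "<<block-A>>\ng = lambda x: f(x) + f(x + 1)",
    "block-C": "<<block-B>>\nh = 10 * f(5) * g(2)",
}

NOWEB_LINE = re.compile(r"^\s*<<([^<>\n]+)>>\s*$")


def render(name, expand, blocks=BLOCKS):
    """Lines of a block; references become code only when expand is true."""
    lines = []
    for line in blocks[name].splitlines():
        match = NOWEB_LINE.match(line)
        if expand and match:
            # Ask the referenced block to render itself the same way.
            lines.extend(render(match.group(1), expand, blocks))
        else:
            lines.append(line)
    return lines


print(render("block-C", expand=False))
print(render("block-C", expand=True))
```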

    What is point (2),(b) all about?
    Why would <<block-B>> care about who references it (i.e. blocks C and D)?
    Wouldn't it be enough for the syntax checker to view just the expanded <<block-A>> reference?
    Indeed, that would almost work, but there are flaws to this approach:
    - things like /unused variable/ or /missing a closing bracket/ will arise,
    - it is unnecessarily expensive.

    This last point is simple: why run the syntax checker in ~block-A~ and ~block-B~, if one could run it in ~block-C~ and map the results back?
    So, the solution is to build a source block tree and run the syntax checker on the end nodes, while propagating the results back to the root. If there is a fork, propagate from the branch which the user ran last.
    How exactly could this work in our example?
    - Keep a buffer which views the ~block-C~ lens and recursively tell the reference-lenses to show the code. 
    - Run the syntax checker there and associate the output w/ each lens (recursively).
    - For completion, propagate user input (from lenses A, B and C) to this same ~block-C~ buffer and deal w/ the result.
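
    The "associate the output w/ each lens" step needs a record of where every expanded line came from. The following sketch keeps that record during expansion and then routes diagnostics back. The diagnostics are made up by hand; no real syntax checker is invoked, and =expand_with_origin= and =route= are hypothetical names.

```python
import re

BLOCKS = {
    "block-A": "f = lambda x: x ** x",
    "block-B": "<<block-A>>\ng = lambda x: f(x) + f(x + 1)",
    "block-C": "<<block-B>>\nh = 10 * f(5) * g(2)",
}

NOWEB_LINE = re.compile(r"^\s*<<([^<>\n]+)>>\s*$")


def expand_with_origin(name, blocks=BLOCKS):
    """Expand references, remembering (block, local line number) per line."""
    out = []
    for number, line in enumerate(blocks[name].splitlines(), start=1):
        match = NOWEB_LINE.match(line)
        if match:
            out.extend(expand_with_origin(match.group(1), blocks))
        else:
            out.append((line, name, number))
    return out


def route(diagnostics, expanded):
    """Send each (absolute line, message) to the block the line came from."""
    per_block = {}
    for absolute, message in diagnostics:
        _, block, local = expanded[absolute - 1]
        per_block.setdefault(block, []).append((local, message))
    return per_block


expanded = expand_with_origin("block-C")
# Pretend a checker running on the expanded block-C buffer reported these.
reports = [(1, "example diagnostic"), (3, "another diagnostic")]
print(route(reports, expanded))
```

    The same table gives the absolute line number of any line in any block, which is what displaying line numbers "as if references were expanded" requires.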

    OK, so much for the tougher share of the problems.
    Now, let's get to the rest of the issues.

    - View customization is in place: /areas/ may have various properties and can show/hide/display the begin/end ornamentation in any manner.
      (e.g. :RESULTS lens truncates lines, while the buffer that contains it doesn't, etc., see the [[Problems]] section)
      (e.g. the /area/ of the code lens is indented two spaces, as its property)

    - One could instruct a lens what to do instead of using regexp + replace.
      (e.g. continuously update a timer)

    Results sections can be shown via lenses.
    Now, when editing w/ 'org-edit-src-code, just split the screen and open :RESULTS in another buffer.
    Editing or running the code: everything will be kept in sync w/ the original buffer.
    Some blocks may output several distinct pieces of data, like here:
    http://kitchingroup.cheme.cmu.edu/blog/2017/01/29/ob-ipython-and-inline-figures-in-org-mode
    Replacing the old results could be just a matter of removing the existing lens and placing in a new one.
    No need to search for the end of the results block.

    You could select a portion of the block and narrow it.

    Showing the line numbers should be a matter of running nlinum in the end node buffer (as w/ syntax checking), but I don't know how nlinum works to say for sure.

    To be fair, some of the descriptions here depend on implementation possibilities, so the level of detail is in accordance.

*** Other uses

**** Inline view of links

     One could make some sort of a lens to view a part of a buffer/file identified by a link.
     The buffer could be the same buffer, an external file, anything.
     One would not have to follow the link.
     When following a link is too cumbersome, the usual workaround is copy-pasting; an inline view would make that unnecessary.

**** Display Org documents

     Sections in an org document could be shown using lenses.
     Not that I know why this would be useful (well, maybe for making a [[Jupyter]] like environment).
     But the concept is interesting.

**** Connect several source blocks

     (I have proposed this on the Org-mode mailing list, but now that I think of it, a user-defined way as here is better.)

     A quite ordinary workflow is to write a block and then test the block separately (or use :epilogue for one-liners).
     However, writing out a test separately is noisy and something like this could be done instead:

     #+NAME: square-1
     #+BEGIN_SRC python
       square = lambda x: x * x
     #+END_SRC
       return [square(x) for x in range(10)]
     #+END_SRC_TEST

     This could be accomplished by lenses: lens-mode would only need to check for two adjacent blocks <name> and <name>-test and then wrap them in another lens, showing both like above.
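
     The adjacency check itself is simple. A sketch, assuming lens-mode already has the block names in document order (=pair_with_tests= is a made-up name):

```python
def pair_with_tests(names):
    """Group each block with an immediately following '<name>-test' block."""
    groups = []
    i = 0
    while i < len(names):
        name = names[i]
        if i + 1 < len(names) and names[i + 1] == name + "-test":
            groups.append((name, names[i + 1]))   # wrap both in one lens
            i += 2
        else:
            groups.append((name,))
            i += 1
    return groups


names = ["square-1", "square-1-test", "helper", "cube", "cube-test"]
print(pair_with_tests(names))
```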

     Of course, some way to let org-babel know what's going on would be necessary.

** Arbitrary Positioning (Window Lenses)

   Window lenses are a logical extension of buffer lenses.
   Imagine you wanted to stack two images horizontally.
   Viewing two images horizontally in Emacs can be done by creating two windows, one image in each window.
   That's what a window lens could do, except it would be shown inside a buffer.

** Interactive Graphs

   This one is tougher than window lenses, but much more interesting.
   Say, you wanted to have an interactive graph in your buffer (gnuplot, matplotlib).
   This part of the buffer would require its own keybindings and all, but the area of the lens could be customized, through its properties of padding/centering and such, to adorn and place the graph in a visually satisfying manner.

   In the land where unicorns live, one would be able to view any graphical window inside Emacs: something that would satisfy the taste of even the finest gourmets of modern text editing (uGhrHrhmph).

   Not that the buffer lenses are the biggest problem here, but they just might come in useful.

** Fast Timer Updates (Text as an Object)

   I have already touched on this topic a bit in the context of Org-Mode, and now, for the sake of completeness, let's form a general characteristic of using lenses.

   A timer was mentioned: currently, as I understand, if you wanted to put it in your buffer and see it continuously update, you would have to use search and replace, which is not efficient.
   The solution offered was to use a lens: it would itself update the timer (as a /shared block/ or some other direct manner).

   And the uses are more general: since /lens-mode/ tracks all the lenses, one could send commands to them.
   They can update independently.
   So, if there are text objects, they may be accessed easily.
   This was one of the goals set in the [[Problem Statement]].
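
   A sketch of the difference from search and replace: if every lens is reachable through a handle, updating the timer is a lookup, not a scan of the buffer text. =LensRegistry= and its methods are assumptions made for the example.

```python
class LensRegistry:
    """Keeps a handle to every lens so it can be updated without searching."""

    def __init__(self):
        self.lenses = {}    # lens id -> current text
        self.order = []     # lens ids in buffer order

    def add(self, lens_id, text):
        self.lenses[lens_id] = text
        self.order.append(lens_id)

    def update(self, lens_id, text):
        # Direct lookup by id; the buffer text is never scanned.
        self.lenses[lens_id] = text

    def render(self):
        return "".join(self.lenses[lens_id] for lens_id in self.order)


buf = LensRegistry()
buf.add("label", "Running for ")
buf.add("timer", "0s")
buf.add("rest", " ...")

for seconds in (1, 2, 3):
    buf.update("timer", f"{seconds}s")

print(buf.render())
```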

** Bug Trackers & Websites & Databases (Interface Building)

   Perhaps, one could work with Gitlab server or similar through lenses.

   Building an interface for a website doesn't seem to be out of the question either.

   Accessing a database: why not?

* Jupyter

  Jupyter is a reproducible research environment.
  https://jupyter.org/

  Here is what using it looks like:
  http://arogozhnikov.github.io/2016/09/10/jupyter-features.html

  Jupyter has a serious flaw.
  It doesn't run inside Emacs.
  (Listen, all is even worse: it runs in a web browser (yes, I know about links and eww.))

  I haven't used it much, so let's rely on the opinion of someone who has:
  https://towardsdatascience.com/5-reasons-why-jupyter-notebooks-suck-4dc201e27086
  - 1. It is almost impossible to enable good code versioning (files are JSON, merging is difficult)
  - 2. There is no IDE integration, no linting, no code-style correction 
       (quite surprising considering the popularity, isn't it?)
  - 3. Very hard to test (not the problem of Jupyter, really)
  - 4. The non-linear workflow of jupyter
  - 5. Jupyter is bad for running long asynchronous tasks

  Someone also said that many snippets in Jupyter are hard to manage. I believe him. 

  The good stuff about Jupyter:
  - 1. easy to use / low entry barrier,
  - 2. interactive graphs,
  - 3. visually distinct cells.

  Plan: Emacs beats Jupyter, Jupyter dies an agonizing death, everyone starts using Emacs (the last two points may be reversed).

  Say, if the stuff in the org-mode section and the interactive graphs (at least for matplotlib) were implemented, making some special easy mode w/ the intuitive keybindings could convert some Jupyter folks.

  The third point on cells is also doable if lenses get some area customizations like padding and centering. And, if necessary, the whole org tree could be viewed through lenses.

  And org-mode can already export to html, for presentation purposes.

  In short: I think making a good reproducible research environment, easily usable and full of features is a good goal and could lure some people into using Emacs (if only superficially at first).

* Implementation

  I am not familiar enough with Emacs internals to say which parts of the proposed structure are feasible.
  
  And the two major things in [[Mechanics]] that somewhat depend on how Emacs works are:
  - shadowing, and
  - making two buffers (w/ different modes) share the same text (/linked buffers/ share the same /shared base/).
  
* Discussion

  Nothing in this proposal is set in stone, feel free to change it to your liking.

  Some naming is probably flawed.
  Some structures might be overly ambitious or unnecessary.
  All implementation-related suggestions are just suggestions.

  Some explanations could be more clear.

  More applications couldn't hurt.

  The effects of the implementation are contained within the lens-mode.
  Which means users should not notice the difference unless they turn on the lens-mode.
  If the implementation is undertaken, this will be good for testing and integration.

  Overall, I think the applications might be very well worth the effort.
  Especially for Org-mode.

Attachment: buffer-lenses.org
Description: Binary data

