>From 44507a06d5213ca986bd901c50fb96915a208ef4 Mon Sep 17 00:00:00 2001 From: Thierry Banel Date: Mon, 17 Aug 2015 14:45:50 +0200 Subject: [PATCH] Add table joining feature * org-tbljoin.el (orgtbl-join), (orgtbl-to-joined-table), (org-insert-dblock:join), (org-dblock-write:join): the new joining engine * org.el, org-table.el: key and menu bindings * test-org-tbljoin.el, org-tbljoin.org: unit tests * org.texi: document feature under the "Joining ables" entry in the "Tables" section. --- doc/org.texi | 290 +++++++++++++++++++++++++ lisp/org-table.el | 3 +- lisp/org-tbljoin.el | 450 +++++++++++++++++++++++++++++++++++++++ lisp/org.el | 3 +- testing/examples/org-tbljoin.org | 251 ++++++++++++++++++++++ testing/lisp/test-org-tbljoin.el | 233 ++++++++++++++++++++ 6 files changed, 1228 insertions(+), 2 deletions(-) create mode 100644 lisp/org-tbljoin.el create mode 100644 testing/examples/org-tbljoin.org create mode 100644 testing/lisp/test-org-tbljoin.el diff --git a/doc/org.texi b/doc/org.texi index 0f5747d..4259ccb 100644 --- a/doc/org.texi +++ b/doc/org.texi @@ -374,6 +374,7 @@ Tables * Orgtbl mode:: The table editor as minor mode * The spreadsheet:: The table editor has spreadsheet capabilities * Org-Plot:: Plotting from org tables +* Joining tables:: Adding material from a table to another The spreadsheet @@ -2095,6 +2096,7 @@ calculations are supported using the Emacs @file{calc} package * Orgtbl mode:: The table editor as minor mode * The spreadsheet:: The table editor has spreadsheet capabilities * Org-Plot:: Plotting from org tables +* Joining tables:: Adding material from a table to another @end menu @node Built-in table editor @@ -3402,6 +3404,294 @@ The formula is an elisp call: @end table address@hidden Joining tables address@hidden Joining tables address@hidden joining two tables + +One table (the master table) is grown by selectively appending rows of +another table (the reference table). As an example, here is a list of +products for a cooking recipe. + address@hidden +| type | quty | +|----------+------| +| onion | 70 | +| tomatoe | 120 | +| eggplant | 300 | +| tofu | 100 | address@hidden verbatim + +We want to complete it with nutritional facts: quantities of fiber, +sugar, proteins, and carbohydrates. For this purpose, we have a long +reference table of standard products. (This table has been freely +borrowed from Nut-Nutrition, @uref{http://nut.sourceforge.net/}, by Jim +Jozwiak). + address@hidden +#+tblname: nut +| type | Fiber | Sugar | Protein | Carb | +|----------+-------+-------+---------+------| +| eggplant | 2.5 | 3.2 | 0.8 | 8.6 | +| tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | +| onion | 1.3 | 4.4 | 1.3 | 9.0 | +| egg | 0 | 18.3 | 31.9 | 18.3 | +| rice | 0.2 | 0 | 1.5 | 16.0 | +| bread | 0.7 | 0.7 | 3.3 | 16.0 | +| orange | 3.1 | 11.9 | 1.3 | 17.6 | +| banana | 2.1 | 9.9 | 0.9 | 18.5 | +| tofu | 0.7 | 0.5 | 6.6 | 1.4 | +| nut | 2.6 | 1.3 | 4.9 | 7.2 | +| corn | 4.7 | 1.8 | 2.8 | 21.3 | address@hidden verbatim + +Let us put the cursor on the "type" column of the recipe table, and type address@hidden C-x j} or @code{M-x orgtbl-join}. A few questions are asked. Then +the recipe gets new columns appended with the needed nutrition facts: + address@hidden +| type | quty | Fiber | Sugar | Protein | Carb | +|----------+------+-------+-------+---------+------| +| onion | 70 | 1.3 | 4.4 | 1.3 | 9.0 | +| tomatoe | 120 | 0.6 | 2.1 | 0.8 | 3.4 | +| eggplant | 300 | 2.5 | 3.2 | 0.8 | 8.6 | +| tofu | 100 | 0.7 | 0.5 | 6.6 | 1.4 | address@hidden verbatim + +If you are familiar with SQL, you would get a similar result with the +a @emph{join} (actually a @emph{left outer join}). + address@hidden +select * +from recipe, nut +left outer join nut on recipe.type = nut.type; address@hidden example + address@hidden +* In-place push pull:: Enriching a table or deriving an enriched table +* Block parameters:: Specifying tables and columns to use +* Duplicates or missing values:: Accept non-perfect fit +* Keeping headers:: Keeping table headers address@hidden menu + address@hidden In-place push pull address@hidden In-place push pull address@hidden in-place, push, pull + +Three modes are available: @emph{in-place}, @emph{push}, @emph{pull}. + address@hidden In-place mode + +The master table is changed (in-place) by appending columns from the +reference table. + address@hidden @kbd + address@hidden C-x j,orgtbl-join} +The cursor must be positioned on the column used to perform the join. User +is prompted for the reference table and the column to use. Then material is +added from the reference table into the table under the cursor. Also +available from the menu @kbd{Tbl > Column > Join with a reference table}. + address@hidden table + address@hidden Push mode + +The master table drives the creation of derived tables. Specify the wanted +result in @code{#+ORGTBL: SEND} directives (as many as desired): + address@hidden +#+ORGTBL: SEND enriched orgtbl-to-joined-table :ref-table nut :mas-column type :ref-column type +| type | quty | +|----------+------| +| onion | 70 | +| tomatoe | 120 | +| eggplant | 300 | +| tofu | 100 | address@hidden verbatim + +The receiving blocks must be created somewhere else in the same file: + address@hidden +#+BEGIN RECEIVE ORGTBL enriched +#+END RECEIVE ORGTBL enriched address@hidden verbatim + +Typing @kbd{C-c C-c} with the cursor on the first pipe of the master table +refreshes all derived tables. + address@hidden Pull mode + +So-called "dynamic blocks" may also be used. The resulting table knows how +to build itself. Here is an example of a master table which is unaware that +it will be enriched in a joined table: + address@hidden +#+TBLNAME: recipe +| type | quty | +|----------+------| +| onion | 70 | +| tomatoe | 120 | +| eggplant | 300 | +| tofu | 100 | address@hidden verbatim + +Create somewhere else a @emph{dynamic block} which carries the specification of +the join: + address@hidden +#+BEGIN: join :mas-table recipe :mas-column type :ref-table nut :ref-column type +| type | quty | type | Fiber | Sugar | Protein | Carb | +|----------+------+----------+-------+-------+---------+------| +| onion | 70 | onion | 1.3 | 4.4 | 1.3 | 9.0 | +| tomatoe | 120 | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | +| eggplant | 300 | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | +| tofu | 100 | tofu | 0.7 | 0.5 | 6.6 | 1.4 | +#+END: address@hidden verbatim + +Typing @kbd{C-c C-c} with the cursor on the @code{#+BEGIN:} line refreshes the +table. + +A wizard is available for the full process. Type @kbd{M-x +org-insert-dblock:join} and answer the questions. + address@hidden As a rule of thumb + +For quick and once-only processing, use @emph{in-place} mode. + +Use @emph{pull} or @emph{push} modes for reproductible work. The @emph{pull} +mode might be easier to use than the @emph{push}, because the address@hidden:join} wizard is available. Other than that, the two +modes use the same underlying engine, so using one or the other is just a +matter or convenience. + address@hidden Block parameters address@hidden Block parameters address@hidden block + +Table creation is driven by bloc lines: @code{#+ORGTBL: SEND} in pull mode +and @code{#+BEGIN} in pull mode. The form is as follow: + address@hidden +#+ORGTBL: SEND ENRICHED orgtbl-to-joined-table :ref-table REFTABLE :mas-column COLUMN :ref-column COLUMN +#+BEGIN: join :mas-table MASTABLE :mas-column COLUMN :ref-table REFTABLE :ref-column COLUMN address@hidden verbatim + +Parameters meaning is as follow: + address@hidden @code + address@hidden SEND ENRICHED + + The name of the derived table, which must match a receiving block: address@hidden +#+BEGIN RECEIVE ORGTBL ENRICHED +#+END RECEIVE ORGTBL ENRICHED address@hidden verbatim + address@hidden orgtbl-to-joined-table + + This is the name of the lisp function in charge of creating the derived + table. Do not change this name. + address@hidden :mas-table MASTABLE + + Gives the name of the master table. This is the table which will be + enriched with material from the reference table. + address@hidden :ref-table REFTABLE + + Gives the name of the reference table. This is the table from which + material will be borrowed. + address@hidden :mas-column COLUMN + + Specifies the column in the master table to match a corresponding column in + the reference table. @code{COLUMN} may be a dollar form as @code{$1} + (first column), @code{$2} (second column), and so on, or the name of the + column if the table has a header. + address@hidden :ref-column COLUMN + + Specifies the column in the reference table to match a corresponding column + in the master table. @code{COLUMN} may be a dollar form as @code{$1} + (first column), @code{$2} (second column), and so on, or the name of the + column if the table has a header. + address@hidden table + + address@hidden Duplicates or missing values address@hidden Duplicates or missing values address@hidden duplicate values address@hidden missing values + +It may happen that no row in the reference table matches a value in +the master table. In this case, the master row is kept, with empty +cells added to it. Information from the master table is never lost. +If, for example, a line in the recipe refers to an unkown "amarante" +product (a cereal known by the ancient Incas), then the resulting +table will still contain the "amarante" row, with empty nutritional +facts. + address@hidden +| type | quty | type | Fiber | Sugar | Protein | Carb | +|----------+------+----------+-------+-------+---------+------| +| onion | 70 | onion | 1.3 | 4.4 | 1.3 | 9.0 | +| tomatoe | 120 | tomatoe | 0.6 | 2.1 | 0.8 | 3.4 | +| eggplant | 300 | eggplant | 2.5 | 3.2 | 0.8 | 8.6 | +| tofu | 100 | tofu | 0.7 | 0.5 | 6.6 | 1.4 | +| amarante | 120 | | | | | | address@hidden verbatim + +The reference table may contain several matching rows for the same +value in the master table. In this case, as many rows are created in +the joined table. Therefore, the resulting table may be longer than +the master table. Example, if the reference table contains three rows +for "eggplants": + address@hidden +#+tblname: nut +| type | Cooking | Fiber | Sugar | Protein | Carb | +|----------+---------+-------+-------+---------+------| +| ... | ... | ... | ... | ... | ... | +| eggplant | boiled | 2.5 | 3.2 | 0.8 | 8.6 | +| eggplant | pickled | 3.4 | 6.5 | 1.2 | 13.3 | +| eggplant | raw | 2.8 | 1.9 | 0.8 | 4.7 | +| ... | ... | ... | ... | ... | ... | address@hidden verbatim + +Then the resulting table will have those three rows appended: + address@hidden +| type | quty | type | Cooking | Fiber | Sugar | Protein | Carb | +|----------+------+----------+---------+-------+-------+---------+------| +| eggplant | 300 | eggplant | boiled | 2.5 | 3.2 | 0.8 | 8.6 | +| eggplant | 300 | eggplant | pickled | 3.4 | 6.5 | 1.2 | 13.3 | +| eggplant | 300 | eggplant | raw | 2.8 | 1.9 | 0.8 | 4.7 | address@hidden verbatim + +If you are familiar with SQL, this behavior is reminicent of the address@hidden outer join}. + address@hidden Keeping headers address@hidden Keeping headers address@hidden header + +The master and the reference tables may or may not have a header. When +there is a header, it may extend over several lines. A header ends +with an horizontal line. + +The join system tries to preserve as much of the master table as possible. +Therefore, if the master table has a header, the joined table will have it +verbatim, over as many lines as needed. + +The reference table header (if any), will fill-in the header (if any) +of the resulting table. But if there is no room in the resulting +table header, the reference table header lines will be ignored, partly +of fully. + + @node Hyperlinks @chapter Hyperlinks @cindex hyperlinks diff --git a/lisp/org-table.el b/lisp/org-table.el index b6d59f1..1ed0804 100644 --- a/lisp/org-table.el +++ b/lisp/org-table.el @@ -4396,7 +4396,8 @@ to execute outside of tables." ["Move Column Left" org-metaleft :active (org-at-table-p) :keys "M-"] ["Move Column Right" org-metaright :active (org-at-table-p) :keys "M-"] ["Delete Column" org-shiftmetaleft :active (org-at-table-p) :keys "M-S-"] - ["Insert Column" org-shiftmetaright :active (org-at-table-p) :keys "M-S-"]) + ["Insert Column" org-shiftmetaright :active (org-at-table-p) :keys "M-S-"] + ["Join with a reference table" orgtbl-join :active (org-at-table-p) :keys "C-c C-x j"]) ("Row" ["Move Row Up" org-metaup :active (org-at-table-p) :keys "M-"] ["Move Row Down" org-metadown :active (org-at-table-p) :keys "M-"] diff --git a/lisp/org-tbljoin.el b/lisp/org-tbljoin.el new file mode 100644 index 0000000..271edd7 --- /dev/null +++ b/lisp/org-tbljoin.el @@ -0,0 +1,450 @@ +;;; orgtbl-join.el --- join columns from another table + +;; Copyright (C) 2014-2015 Free Software Foundation, Inc. + +;; Author: Thierry Banel tbanelwebmin at free dot fr +;; Keywords: org, table, join, filtering + +;; This file is part of GNU Emacs. + +;; GNU Emacs is free software: you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation, either version 3 of the License, or +;; (at your option) any later version. + +;; GNU Emacs is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GNU Emacs. If not, see . + +;;; Commentary: +;; +;; A master table is enriched with columns coming from a reference +;; table. For enriching a row of the master table, matching rows from +;; the reference table are selected. The matching succeeds when the +;; key cells of the master row and the reference row are equal. + +;;; Requires: +(require 'org-table) + +;;; Code: + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Utility functions + +(defun orgtbl--join-colname-to-int (col table) + "Convert the column name into an integer (first column is numbered 0) +COL may be: +- a dollar form, like $5 which is converted to 4 +- a number, like 5 which is converted to 4 +- an alphanumeric name which appears in the column header (if any) +When COL does not match any actual column, an error is generated. +TABLE is an Org mode table passed as a list of lists of cells. +It is used to check COL against TABLE header." + ;; skip first hlines if any + (while (not (listp (car table))) + (setq table (cdr table))) + (if (symbolp col) + (setq col (symbol-name col))) + (cond ((numberp col) + t) + ((string-match "^\\$?\\([0-9]+\\)$" col) + (setq col (string-to-number (match-string 1 col)))) + (t + ;; TABLE has no header, COL does not make sense + (unless (memq 'hline table) + (user-error "No header on the table, and no such column '%s'" col)) + ;; iterate over first line of header to find COL + (let ((i 0) + (n)) + (mapc (lambda (c) + (setq i (1+ i)) + (if (equal col c) + (setq n i))) + (car table)) + (unless n (user-error "No such column '%s'" col)) + (setq col n)))) + (setq col (1- col)) + (if (or (< col 0) (>= col (length (car table)))) + (user-error "Column %s outside table" col)) + col) + +(defun orgtbl--join-query-column (prompt table) + "Interactively query a column. +PROMPT is displayed to the user to explain what answer is expected. +TABLE is the org mode table from which a column will be choosen +by the user. Its header is used for column names completion. If +TABLE has no header, completion is done on generic column names: +$1, $2..." + (while (eq 'hline (car table)) + (setq table (cdr table))) + (org-icompleting-read + prompt + (if (memq 'hline table) ;; table has a header + (car table) + (let ((i 0)) + (mapcar (lambda (x) (format "$%s" (setq i (1+ i)))) + (car table)))))) + +(defun orgtbl--join-convert-to-hashtable (table col) + "Convert an Org-mode TABLE into a hash table. +The purpose is to provide fast lookup to TABLE's rows. The COL +column contains the keys for the hashtable entries. Return a +cons, the car contains the header, the cdr contains the +hashtable." + ;; skip heading horinzontal lines if any + (while (eq (car table) 'hline) + (setq table (cdr table))) + ;; split header and body + (let ((head) + (body (memq 'hline table)) + (hash (make-hash-table :test 'equal :size (+ 20 (length table))))) + (if (not body) + (setq body table) + (setq head table) + ;; terminate header with nil + (let ((h head)) + (while (not (eq (cadr h) 'hline)) + (setq h (cdr h))) + (setcdr h nil))) + ;; fill-in the hashtable + (mapc (lambda (row) + (when (listp row) + (let ((key (nth col row))) + (puthash key (nconc (gethash key hash) (list row)) hash)))) + body) + (cons head hash))) + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; The following functions are borrowed from the orgtbl-aggregate package. +;; They are general enough to be moved to org-table.el + +(defun orgtbl-list-local-tables () + "Search for available tables in the current file." + (interactive) + (let ((tables)) + (save-excursion + (save-restriction + (widen) + (goto-char (point-min)) + (while (re-search-forward "^[ \t]*#\\+\\(tbl\\)?name:[ \t]*\\(.*\\)" nil t) + (let ((text (match-string 2))) + (set-text-properties 0 (length text) () text) + (setq tables (cons text tables)))))) + tables)) + +(defun orgtbl-get-distant-table (name-or-id) + "Find a table in the current buffer named NAME-OR-ID. +Returns it as a list of lists of cells. An horizontal line is +translated as the special symbol `hline'." + (unless (stringp name-or-id) + (setq name-or-id (format "%s" name-or-id))) + (let (buffer loc id-loc tbeg form) + (save-excursion + (save-restriction + (widen) + (save-excursion + (goto-char (point-min)) + (if (re-search-forward + (concat "^[ \t]*#\\+\\(tbl\\)?name:[ \t]*" + (regexp-quote name-or-id) + "[ \t]*$") + nil t) + (setq buffer (current-buffer) loc (match-beginning 0)) + (setq id-loc (org-id-find name-or-id 'marker)) + (unless (and id-loc (markerp id-loc)) + (error "Can't find remote table \"%s\"" name-or-id)) + (setq buffer (marker-buffer id-loc) + loc (marker-position id-loc)) + (move-marker id-loc nil))) + (with-current-buffer buffer + (save-excursion + (save-restriction + (widen) + (goto-char loc) + (forward-char 1) + (unless (and (re-search-forward "^\\(\\*+ \\)\\|[ \t]*|" nil t) + (not (match-beginning 1))) + (user-error "Cannot find a table at NAME or ID %s" name-or-id)) + (setq tbeg (point-at-bol)) + (org-table-to-lisp)))))))) + +(defun orgtbl-insert-elisp-table (table) + "Insert TABLE in current buffer at point. +TABLE is a list of lists of cells. The list may contain the +special symbol 'hline to mean an horizontal line." + (while table + (let ((row (car table))) + (setq table (cdr table)) + (cond ((consp row) + (insert "|") + (insert (mapconcat #'identity row "|"))) + ((eq row 'hline) + (insert "|-")) + (t (error "Bad row in elisp table"))) + (insert "\n"))) + (delete-char -1) + (org-table-align)) + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; In-place mode + +;;;###autoload +(defun orgtbl-join () + "Add material from a reference table to the current table. +Rows from the reference table are appended to rows of the current +table. For each row of the current table, matching rows from the +reference table are searched and appended. The matching is +performed by testing for equality of cells in the current column, +and a joining column in the reference table. If a row in the +current table matches several rows in the reference table, then +the current row is duplicated and each copy is appended with a +different reference row. If no matching row is found in the +reference table, then the current row is kept, with empty cells +appended to it." + (interactive) + (org-table-check-inside-data-field) + (let* ((col (1- (org-table-current-column))) + (tbl (org-table-to-lisp)) + (ref (orgtbl-get-distant-table + (org-icompleting-read + "Reference table: " + (orgtbl-list-local-tables)))) + (dcol (orgtbl--join-colname-to-int + (orgtbl--join-query-column "Reference column: " ref) + ref)) + (refhead) + (refhash)) + (setq ref (orgtbl--join-convert-to-hashtable ref dcol) + refhead (car ref) + refhash (cdr ref)) + (goto-char (org-table-begin)) + ;; Skip any hline a the top of tbl. + (while (eq (car tbl) 'hline) + (setq tbl (cdr tbl)) + (forward-line 1)) + ;; is there a header on tbl ? append the ref header (if any) + (when (memq 'hline tbl) + ;; for each line of header in tbl, add a header from ref + ;; if ref-header empties too fast, continue with nils + ;; if tbl-header empties too fast, ignore remaining ref-headers + (while (listp (pop tbl)) + (end-of-line) + (when refhead + (orgtbl--join-insert-ref-row (car refhead) dcol) + (setq refhead (cdr refhead))) + (forward-line 1)) + (forward-line 1)) + ;; now the body of the tbl + (mapc (lambda (masline) + (if (listp masline) + (let ((done)) + ;; if several ref-lines match, all of them are considered + (mapc (lambda (refline) + (end-of-line) + (when done ;; make a copy of the current row + (open-line 1) + (forward-line 1) + (insert "|") + (mapc (lambda (y) (insert y) (insert "|")) + masline)) + (orgtbl--join-insert-ref-row refline dcol) + (setq done t)) + (gethash (nth col masline) refhash)))) + (forward-line 1)) + tbl)) + (forward-line -1) + (org-table-align)) + +(defun orgtbl--join-insert-ref-row (row dcol) + "Insert a distant ROW in the buffer. +The DCOL columns (joining column) is skipped." + (let ((i 0)) + (while row + (unless (equal i dcol) + (insert (car row)) + (insert "|")) + (setq i (1+ i)) + (setq row (cdr row))))) + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; PULL & PUSH engine + +(defun orgtbl--join-append-mas-ref-row (masrow refrow refcol) + "Concatenate master and reference rows, skiping the reference column. +MASROW is a list of cells from the master table. REFROW is a +list of cells from the reference table. REFCOL is the position, +numbered from zero, of the column in REFROW that should not be +appended in the result, because it is already present in MASROW." + (let ((result (reverse masrow)) + (i 0)) + (while refrow + (unless (equal i refcol) + (setq result (cons (car refrow) result))) + (setq refrow (cdr refrow)) + (setq i (1+ i))) + (reverse result))) + +(defun orgtbl--create-table-joined (mastable mascol reftable refcol) + "Join a master table with a reference table. +MASTABLE is the master table, as a list of lists of cells. +MASCOL is the name of the joining column in the master table. +REFTABLE is the reference table. +REFCOL is the name of the joining column in the reference table. +Returns MASTABLE enriched with material from REFTABLE." + (let ((result) ;; result built in reverse order + (refhead) + (refhash)) + ;; skip any hline a the top of both tables + (while (eq (car mastable) 'hline) + (setq result (cons 'hline result)) + (setq mastable (cdr mastable))) + (while (eq (car reftable) 'hline) + (setq reftable (cdr reftable))) + ;; convert column-names to numbers + (setq mascol (orgtbl--join-colname-to-int mascol mastable)) + (setq refcol (orgtbl--join-colname-to-int refcol reftable)) + ;; convert reference table into fast-lookup hashtable + (setq reftable (orgtbl--join-convert-to-hashtable reftable refcol) + refhead (car reftable) + refhash (cdr reftable)) + ;; iterate over master table header if any + ;; and join it with reference table header if any + (if (memq 'hline mastable) + (while (listp (car mastable)) + (setq result + (cons (orgtbl--join-append-mas-ref-row + (car mastable) + (and refhead (car refhead)) + refcol) + result)) + (setq mastable (cdr mastable)) + (if refhead + (setq refhead (cdr refhead))))) + ;; create the joined table + (mapc (lambda (masline) + (if (not (listp masline)) + (setq result (cons masline result)) + (let ((result0 result)) + ;; if several ref-lines match, all of them are considered + (mapc (lambda (refline) + (setq result + (cons + (orgtbl--join-append-mas-ref-row + masline + refline + refcol) + result))) + (gethash (nth mascol masline) refhash)) + ;; if no ref-line matches, add the non-matching master-line anyway + (if (eq result result0) + (setq result (cons masline result)))))) + mastable) + (nreverse result))) + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; PUSH mode + +;;;###autoload +(defun orgtbl-to-joined-table (table params) + "Enrich the master TABLE with lines from a reference table. + +PARAMS contains pairs of key-value with the following keys: + +:ref-table the reference table. + Lines from the reference table will be added to the + master table. + +:mas-column the master joining column. + This column names one of the master table columns. + +:ref-column the reference joining column. + This column names one of the reference table columns. + +Columns names are either found in the header of the table, if the +table have a header, or a dollar form: $1, $2, and so on. + +The destination must be specified somewhere in the +same file with a bloc like this: +#+BEGIN RECEIVE ORGTBL destination_table_name +#+END RECEIVE ORGTBL destination_table_name" + (interactive) + (orgtbl-to-generic + (orgtbl--create-table-joined + table + (plist-get params :mas-column) + (orgtbl-get-distant-table (plist-get params :ref-table)) + (plist-get params :ref-column)) + (org-combine-plists + (list :sep "|" :hline "|-" :lstart "|" :lend "|") + params))) + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; PULL mode + +;;;###autoload +(defun org-insert-dblock:join () + "Wizard to interactively insert a joined table as a dynamic block." + (interactive) + (let* ((localtables (orgtbl-list-local-tables)) + (mastable + (org-icompleting-read + "Master table: " + localtables)) + (mascol + (orgtbl--join-query-column + "Master joining column: " + (orgtbl-get-distant-table mastable))) + (reftable + (org-icompleting-read + "Reference table: " + localtables)) + (refcol + (orgtbl--join-query-column + "Reference joining column: " + (orgtbl-get-distant-table reftable)))) + (org-create-dblock + (list :name "join" + :mas-table mastable :mas-column mascol + :ref-table reftable :ref-column refcol)) + (org-update-dblock))) + +;;;###autoload +(defun org-dblock-write:join (params) + "Create a joined table out of a master and a reference table. + +PARAMS contains pairs of key-value with the following keys: + +:mas-table the master table. + This table will be copied and enriched with material + from the reference table. + +:ref-table the reference table. + Lines from the reference table will be added to the + master table. + +:mas-column the master joining column. + This column names one of the master table columns. + +:ref-column the reference joining column. + This column names one of the reference table columns. + +Columns names are either found in the header of the table, if the +table have a header, or a dollar form: $1, $2, and so on. + +The +#+BEGIN RECEIVE ORGTBL destination_table_name +#+END RECEIVE ORGTBL destination_table_name" + (interactive) + (orgtbl-insert-elisp-table + (orgtbl--create-table-joined + (orgtbl-get-distant-table (plist-get params :mas-table)) + (plist-get params :mas-column) + (orgtbl-get-distant-table (plist-get params :ref-table)) + (plist-get params :ref-column)))) + +(provide 'org-tbljoin) +;;; org-tbljoin.el ends here diff --git a/lisp/org.el b/lisp/org.el index 9336183..0dea3fa 100755 --- a/lisp/org.el +++ b/lisp/org.el @@ -21643,7 +21643,8 @@ on context. See the individual commands for more information." ["Move Column Left" org-metaleft (org-at-table-p)] ["Move Column Right" org-metaright (org-at-table-p)] ["Delete Column" org-shiftmetaleft (org-at-table-p)] - ["Insert Column" org-shiftmetaright (org-at-table-p)]) + ["Insert Column" org-shiftmetaright (org-at-table-p)] + ["Join with a reference table" orgtbl-join :active (org-at-table-p) :keys "C-c C-x j"]) ("Row" ["Move Row Up" org-metaup (org-at-table-p)] ["Move Row Down" org-metadown (org-at-table-p)] diff --git a/testing/examples/org-tbljoin.org b/testing/examples/org-tbljoin.org new file mode 100644 index 0000000..5ca2333 --- /dev/null +++ b/testing/examples/org-tbljoin.org @@ -0,0 +1,251 @@ +#+Title: a collection of examples org-tbljoin tests + +* In-place mode + +** Nutritional reference table with header + +There are multiple "eggplant" entries on purpose. +They will all get added to the master table when joining. + +The header extends on 3 lines. All 3 lines will be concatenated to the +master table header, provided the master table header has at least 3 +lines. The excess lines will be ignored. + +#+tblname: nut_with_header +|------+----------+-------+-------+---------| +| Carb | type | Fiber | Sugar | Protein | +| ohyd | | | | | +| rate | | | | | +|------+----------+-------+-------+---------| +| 8.6 | eggplant | 2.5 | 3.2 | 0.8 | +| 8.7 | eggplant | 2.6 | 3.3 | 0.9 | +| 3.4 | tomatoe | 0.6 | 2.1 | 0.8 | +| 9.0 | onion | 1.3 | 4.4 | 1.3 | +| 18.3 | egg | 0 | 18.3 | 31.9 | +| 16.0 | rice | 0.2 | 0 | 1.5 | +| 16.0 | bread | 0.7 | 0.7 | 3.3 | +| 17.6 | orange | 3.1 | 11.9 | 1.3 | +| 18.5 | banana | 2.1 | 9.9 | 0.9 | +| 1.4 | tofu | 0.7 | 0.5 | 6.6 | +| 7.2 | nut | 2.6 | 1.3 | 4.9 | +| 21.3 | corn | 4.7 | 1.8 | 2.8 | +| 8.5 | eggplant | ? | ? | ? | +| | | | | | + +** Nutritional reference table without header + +#+tblname: nut_no_header +| 8.6 | eggplant | 2.5 | 3.2 | 0.8 | +| 8.7 | eggplant | 2.6 | 3.3 | 0.9 | +| 3.4 | tomatoe | 0.6 | 2.1 | 0.8 | +| 9.0 | onion | 1.3 | 4.4 | 1.3 | +| 18.3 | egg | 0 | 18.3 | 31.9 | +| 16.0 | rice | 0.2 | 0 | 1.5 | +| 16.0 | bread | 0.7 | 0.7 | 3.3 | +| 17.6 | orange | 3.1 | 11.9 | 1.3 | +| 18.5 | banana | 2.1 | 9.9 | 0.9 | +| 1.4 | tofu | 0.7 | 0.5 | 6.6 | +| 7.2 | nut | 2.6 | 1.3 | 4.9 | +| 21.3 | corn | 4.7 | 1.8 | 2.8 | +| 8.5 | eggplant | ? | ? | ? | +| | | | | | + +** With a header and a formula + :PROPERTIES: + :ID: cc039f82-24d2-422c-a5ae-4dea09cce684 + :END: + +- Put the cursor on the "type" column +- type + : M-x orgtbl-join +- answer + : nut_with_header + : type + +| quty | type | units | mul | +|------+----------+-------+------| +| 70 | onion | 5 | 350 | +| 120 | tomatoe | 8 | 960 | +| 300 | eggplant | 2 | 600 | +|------+----------+-------+------| +| 100 | tofu | 1 | 100 | +| 250 | corn | 15 | 3750 | +| 90 | tomatoe | 5 | 450 | +|------+----------+-------+------| +| 80 | amarante | 1 | 80 | +#+TBLFM: $4=$1*$3 + +The master tables have a formula on the last column, which will be +preserved after joining. + +** Without a header, with a formula +- Put the cursor on the second column +- type + : M-x orgtbl-join +- answer + : nut_with_header + : type + +| 70 | onion | 5 | 350 | +| 120 | tomatoe | 8 | 960 | +| 300 | eggplant | 2 | 600 | +| 100 | tofu | 1 | 100 | +| 250 | corn | 15 | 3750 | +| 90 | tomatoe | 5 | 450 | +| 80 | amarante | 1 | 80 | +#+TBLFM: $4=$1*$3 + +The master tables have a formula on the last column, which will be +preserved after joining. + +* PULL mode + +** Master table with oversized header + +#+tblname: meal_with_header +| product | quty | +| common | in | +| name | gramms | +| (english) | | +|-----------+--------| +| onion | 70 | +| unknown | 999 | +| tomatoe | 120 | +| eggplant | 300 | +| corn | 250 | + +** Master table without header + +#+tblname: meal_no_header +| onion | 70 | +| not known | 999 | +| tomatoe | 120 | +| eggplant | 300 | +| corn | 250 | + +** Join header+header +Marker: a14723d3-13c8-4fd1-a69f-caf2fdb2d2b1 +#+BEGIN: join :mas-table meal_with_header :mas-column $1 :ref-table nut_with_header :ref-column 2 +#+END: + +** join header+bare +Marker: 79a90117-fc0e-4556-b790-c925b6acd450 +#+BEGIN: join :mas-table "meal_with_header" :mas-column "product" :ref-table "nut_no_header" :ref-column "2" +#+END: + +** join bare+header +Marker: 24c5a7b4-2815-40d4-89ec-32b58f492b32 +#+BEGIN: join :mas-table meal_no_header :mas-column $1 :ref-table nut_with_header :ref-column type +#+END: + +** join bare+bare +Marker: e872df5c-dd5c-4ad6-b395-2e5a000488b5 +#+BEGIN: join :mas-table meal_no_header :mas-column 1 :ref-table nut_no_header :ref-column $2 +#+END: + +* PUSH mode + +** Push a master table with header + +1st reference table has a larger header +2nd reference table has no header + +Marker: 6426c948-bbd4-4b25-8b3d-2584b70af4d0 +#+ORGTBL: SEND joined1 orgtbl-to-joined-table :ref-table nut_with_header :mas-column product :ref-column type +#+ORGTBL: SEND joined2 orgtbl-to-joined-table :ref-table "nut_no_header" :mas-column "$1" :ref-column $2 +| product | quty | +| (yes) | (g) | +|---------------+------| +| onion | 70 | +| not specified | 999 | +| tomatoe | 120 | +| eggplant | 300 | +| corn | 250 | + +#+BEGIN RECEIVE ORGTBL joined1 +#+END RECEIVE ORGTBL joined1 + +#+BEGIN RECEIVE ORGTBL joined2 +#+END RECEIVE ORGTBL joined2 + +** Push a master table with not header + +1st reference table has a larger header +2nd reference table has no header + +Marker: 1683a68c-f4df-4b04-9f2d-bd4c8a909bf3 +#+ORGTBL: SEND joined3 orgtbl-to-joined-table :ref-table nut_with_header :mas-column "1" :ref-column type +#+ORGTBL: SEND joined4 orgtbl-to-joined-table :ref-table "nut_no_header" :mas-column $1 :ref-column $2 +| onion | 70 | +| not specified | 999 | +| tomatoe | 120 | +| eggplant | 300 | +| corn | 250 | + +#+BEGIN RECEIVE ORGTBL joined3 +#+END RECEIVE ORGTBL joined3 + +#+BEGIN RECEIVE ORGTBL joined4 +#+END RECEIVE ORGTBL joined4 + +* Cartesian product + +What happens when the master and the reference table are the same +table? A so-called cartesian product (named after the mathematician +René Descartes) is created. Every possible combination of rows is +created. + +** Simple auto-join in pull-mode + +The table is joined with itself, creating a cartesian product. The +resulting table size is the square of the original table size (7*7 = +49). + +#+tblname: auto +| t | n | +|---+---| +| a | 1 | +| a | 2 | +| a | 3 | +| a | 4 | +| a | 5 | +| a | 6 | +| a | 7 | + +Marker: b6e51dab-cded-427e-8967-d14a34070d08 +#+BEGIN: join :mas-table auto :mas-column t :ref-table auto :ref-column "t" +#+END: + +** Two sub-cartesian-products in push mode + +Because the table has two keys (a & b), two completely unrelated +cartesian products are created, each the square size of the source +(3^2 + 2^2 = 13). + +#+tblname: buto +#+ORGTBL: SEND buto2 orgtbl-to-joined-table :ref-table buto :mas-column "t" :ref-column t +| t | n | +|---+---| +| a | 1 | +| a | 2 | +| a | 3 | +| b | 4 | +| b | 5 | + +#+BEGIN RECEIVE ORGTBL buto2 +| t | n | n | +|---+---+---| +| a | 1 | 1 | +| a | 1 | 2 | +| a | 1 | 3 | +| a | 2 | 1 | +| a | 2 | 2 | +| a | 2 | 3 | +| a | 3 | 1 | +| a | 3 | 2 | +| a | 3 | 3 | +| b | 4 | 4 | +| b | 4 | 5 | +| b | 5 | 4 | +| b | 5 | 5 | +#+END RECEIVE ORGTBL buto2 diff --git a/testing/lisp/test-org-tbljoin.el b/testing/lisp/test-org-tbljoin.el new file mode 100644 index 0000000..54cd6df --- /dev/null +++ b/testing/lisp/test-org-tbljoin.el @@ -0,0 +1,233 @@ +;;; test-org-tbljoin.el --- tests for org-tbljoin.el + +;; Copyright (C) 2015 Thierry Banel + +;; This program is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation, either version 3 of the License, or +;; (at your option) any later version. + +;; This program is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with this program. If not, see . + +;;; Code: + +;(unless (featurep 'org-tbljoin) +; (signal 'missing-test-dependency "Table Join")) + +(defun org-tbljoin-test-pull (marker expected) + (org-test-at-marker + "../examples/org-tbljoin.org" + marker + (forward-line 1) + (org-update-dblock) + (let ((result (buffer-substring-no-properties (point) (progn (search-forward "#+END:") (point))))) + (set-buffer-modified-p nil) + (kill-buffer) + (should (equal result expected))))) + +(ert-deftest org-tbljoin/pull/hh () + "Joining two tables." + (org-tbljoin-test-pull + "a14723d3-13c8-4fd1-a69f-caf2fdb2d2b1" + "#+BEGIN: join :mas-table meal_with_header :mas-column $1 :ref-table nut_with_header :ref-column 2 +| product | quty | Carb | Fiber | Sugar | Protein | +| common | in | ohyd | | | | +| name | gramms | rate | | | | +| (english) | | | | | | +|-----------+--------+------+-------+-------+---------| +| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| unknown | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END:")) + +(ert-deftest org-tbljoin/pull/hb () + "Joining two tables." + (org-tbljoin-test-pull + "79a90117-fc0e-4556-b790-c925b6acd450" + "#+BEGIN: join :mas-table \"meal_with_header\" :mas-column \"product\" :ref-table \"nut_no_header\" :ref-column \"2\" +| product | quty | | | | | +| common | in | | | | | +| name | gramms | | | | | +| (english) | | | | | | +|-----------+--------+------+-----+-----+-----| +| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| unknown | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END:")) + +(ert-deftest org-tbljoin/pull/bh () + "Joining two tables." + (org-tbljoin-test-pull + "24c5a7b4-2815-40d4-89ec-32b58f492b32" + "#+BEGIN: join :mas-table meal_no_header :mas-column $1 :ref-table nut_with_header :ref-column type +| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| not known | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END:")) + +(ert-deftest org-tbljoin/pull/bb () + "Joining two tables." + (org-tbljoin-test-pull + "e872df5c-dd5c-4ad6-b395-2e5a000488b5" + "#+BEGIN: join :mas-table meal_no_header :mas-column 1 :ref-table nut_no_header :ref-column $2 +| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| not known | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END:")) + +(defun org-tbljoin-test-push (markersrc marker1 marker2 expected1 expected2) + (org-test-at-marker + "../examples/org-tbljoin.org" + markersrc + (beginning-of-line) + (forward-line 3) + (orgtbl-send-table 'maybe) + (let ((result1 + (buffer-substring-no-properties + (progn (search-forward marker1) (forward-char 1) (org-table-align) (point)) + (progn (search-forward "#+END") (point)))) + (result2 + (buffer-substring-no-properties + (progn (search-forward marker2) (forward-char 1) (org-table-align) (point)) + (progn (search-forward "#+END") (point))))) + (set-buffer-modified-p nil) + (kill-buffer) + (message "result1 = %s" result1) + (should + (and + (equal result1 expected1) + (equal result2 expected2)))))) + +(ert-deftest org-tbljoin/push/h () + (org-tbljoin-test-push + "6426c948-bbd4-4b25-8b3d-2584b70af4d0" + "#+BEGIN RECEIVE ORGTBL joined1" + "#+BEGIN RECEIVE ORGTBL joined2" + "| product | quty | Carb | Fiber | Sugar | Protein | +| (yes) | (g) | ohyd | | | | +|---------------+------+------+-------+-------+---------| +| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| not specified | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END" + "| product | quty | | | | | +| (yes) | (g) | | | | | +|---------------+------+------+-----+-----+-----| +| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| not specified | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END")) + +(ert-deftest org-tbljoin/push/b () + (org-tbljoin-test-push + "1683a68c-f4df-4b04-9f2d-bd4c8a909bf3" + "#+BEGIN RECEIVE ORGTBL joined3" + "#+BEGIN RECEIVE ORGTBL joined4" + "| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| not specified | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END" + "| onion | 70 | 9.0 | 1.3 | 4.4 | 1.3 | +| not specified | 999 | | | | | +| tomatoe | 120 | 3.4 | 0.6 | 2.1 | 0.8 | +| eggplant | 300 | 8.6 | 2.5 | 3.2 | 0.8 | +| eggplant | 300 | 8.7 | 2.6 | 3.3 | 0.9 | +| eggplant | 300 | 8.5 | ? | ? | ? | +| corn | 250 | 21.3 | 4.7 | 1.8 | 2.8 | +#+END")) + + +(ert-deftest org-tbljoin/pull/auto () + "Cartesian product of a table with itself." + (org-tbljoin-test-pull + "b6e51dab-cded-427e-8967-d14a34070d08" + "#+BEGIN: join :mas-table auto :mas-column t :ref-table auto :ref-column \"t\" +| t | n | n | +|---+---+---| +| a | 1 | 1 | +| a | 1 | 2 | +| a | 1 | 3 | +| a | 1 | 4 | +| a | 1 | 5 | +| a | 1 | 6 | +| a | 1 | 7 | +| a | 2 | 1 | +| a | 2 | 2 | +| a | 2 | 3 | +| a | 2 | 4 | +| a | 2 | 5 | +| a | 2 | 6 | +| a | 2 | 7 | +| a | 3 | 1 | +| a | 3 | 2 | +| a | 3 | 3 | +| a | 3 | 4 | +| a | 3 | 5 | +| a | 3 | 6 | +| a | 3 | 7 | +| a | 4 | 1 | +| a | 4 | 2 | +| a | 4 | 3 | +| a | 4 | 4 | +| a | 4 | 5 | +| a | 4 | 6 | +| a | 4 | 7 | +| a | 5 | 1 | +| a | 5 | 2 | +| a | 5 | 3 | +| a | 5 | 4 | +| a | 5 | 5 | +| a | 5 | 6 | +| a | 5 | 7 | +| a | 6 | 1 | +| a | 6 | 2 | +| a | 6 | 3 | +| a | 6 | 4 | +| a | 6 | 5 | +| a | 6 | 6 | +| a | 6 | 7 | +| a | 7 | 1 | +| a | 7 | 2 | +| a | 7 | 3 | +| a | 7 | 4 | +| a | 7 | 5 | +| a | 7 | 6 | +| a | 7 | 7 | +#+END:")) + +(provide 'test-org-tbljoin) +;;; test-org-tbljoin.el ends here -- 2.1.4