emacs-orgmode
From: Utkarsh Singh
Subject: bug#47885: [PATCH] org-table-import: Make it more smarter for interactive use
Date: Wed, 28 Apr 2021 14:07:37 +0530
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

Hi,

On 2021-04-27, 22:21 +0200, Nicolas Goaziou <mail@nicolasgoaziou.fr> wrote:

>> + When using org-table-import interactively, if we fail to guess the
>> separator, we are left with a user-error message and an
>> 'unconverted table'.  We can make use of a 'temp-buffer' to import our
>> file only after a successful conversion.
>
> I'm not sure to understand what you mean.

Note: I would advise you to apply patch no. 2 before trying out the
following example.

1. Download the attached CSV file.  We can call it example.csv.

2. Go to *scratch* buffer.

3. Use 'M-x org-table-import' to import example.csv as an Org table.

You will see that even though org-table-guess-separator failed to guess
the separator, we are still left with the unconverted region inserted
into our buffer.
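
The temp-buffer idea quoted above could look roughly like the sketch
below.  The name `my/org-table-import-via-temp-buffer' is hypothetical
and not part of any patch in this thread.  The sketch assumes that
`org-table-convert-region' signals an error when it cannot guess the
separator, which is only the case once patch no. 2 is applied.

```elisp
(require 'org-table)

;; Hypothetical helper for illustration only; not part of the patches.
(defun my/org-table-import-via-temp-buffer (file &optional separator)
  "Insert FILE at point as an Org table, converting it in a temp buffer.
SEPARATOR is passed unchanged to `org-table-convert-region'.
If the conversion signals an error, the current buffer is left untouched."
  (let ((table (with-temp-buffer
                 (insert-file-contents file)
                 ;; Any error raised here aborts before we insert anything
                 ;; into the buffer the command was called from.
                 (org-table-convert-region (point-min) (point-max) separator)
                 (buffer-string))))
    (unless (bolp) (insert "\n"))
    (insert table)))
```

Because the conversion happens before anything is inserted, a failed
guess leaves only the error message and no unconverted text.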

>> + The conversion part of org-table-convert-region makes a distinction
>> between '(4) (comma separator) and the rest of the separators.  We
>> should either add the string version of comma as an AND condition or
>> rewrite this part to simplify it.
>
> Ditto. But it can be the object of another patch. Let's concentrate on
> `org-table-guess-separator' first.
>
>> I am willing to do these possible changes but currently waiting for your
>> review for org-table-guess-separator as there can be more serious bugs
>> lurking around on my code which I am considering base for these
>> changes.
>
> You should definitely write tests for this function. Here's a start:
>
>     (ert-deftest test-org-table/guess-separator ()
>       "Test `org-table-guess-separator'."
>       ;; Test space separator.
>       (should
>        (equal " "
>               (org-test-with-temp-text "a b\nc d"
>                 (org-table-guess-separator (point-min) (point-max)))))
>       (should
>        (equal " "
>               (org-test-with-temp-text "a b\nc d"
>                 (org-table-guess-separator (point-min) (point-max)))))
>       ;; Test "inverted" region.
>       (should
>        (equal " "
>               (org-test-with-temp-text "a b\nc d"
>                 (org-table-guess-separator (point-max) (point-min)))))
>       ;; Do not error on empty region.
>       (should-not
>        (org-test-with-temp-text ""
>          (org-table-guess-separator (point-max) (point-min))))
>       (should-not
>        (org-test-with-temp-text "   \n"
>          (org-table-guess-separator (point-max) (point-min)))))
>

I will surely do more testing.

I would also like to simplify the condition for guessing SPACE as the
separator, because of the following cases:

+ field1 'this is field2' 'this is field3' :: In this case we still have
SPACE inside the quotes (' in this case).

+ Since SPACE is our last valid separator, I think searching for a line
which doesn't contain a space is more than enough.
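
To make the simplified rule concrete, here is a minimal sketch that works
on a string instead of a region.  `my/guess-separator-in-string' is a
hypothetical name used only for illustration.  It trims the whole string
with `string-trim', which is a simplification of the region handling in
the patch.

```elisp
(require 'subr-x)  ; for `string-trim'

;; Hypothetical helper for illustration only; not part of the patches.
(defun my/guess-separator-in-string (text)
  "Guess the field separator used in TEXT.
Try comma, TAB, semicolon, colon and SPACE in that order and return
the first one that occurs on every non-empty line, as a string.
Return nil when no candidate fits or TEXT is blank."
  (let ((trimmed (string-trim text)))
    (unless (equal trimmed "")
      (with-temp-buffer
        (insert trimmed)
        (catch :found
          (dolist (sep '("," "\t" ";" ":" " "))
            (goto-char (point-min))
            ;; A line made up only of non-SEP characters means SEP is
            ;; missing on that line, so SEP cannot be the separator.
            (unless (re-search-forward (concat "^[^\n" sep "]+$") nil t)
              (throw :found sep)))
          nil)))))
```

With this rule the quoted-field case above is guessed as SPACE, since
every line contains at least one space and none of the earlier
candidates appears on every line.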

Required patches:

From 6b112927de73c43edfd08254217808ebff42772a Mon Sep 17 00:00:00 2001
From: Utkarsh Singh <utkarsh190601@gmail.com>
Date: Wed, 28 Apr 2021 10:26:46 +0530
Subject: [PATCH 1/3] org-table.el (org-table-import): add yes-or-no prompt

Add a yes-or-no prompt for files which don't have .txt, .tsv or .csv
as file extension.
---
 lisp/org/org-table.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index 0e93fb271f..e0b2be6892 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -938,7 +938,8 @@ org-table-import
 - regexp  When a regular expression, use it to match the separator."
   (interactive "f\nP")
   (when (and (called-interactively-p 'any)
-            (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file)))
+            (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file))
+             (not (yes-or-no-p "File does not have .txt, .tsv or .csv as extension.  Do you still want to continue? ")))
     (user-error "Cannot import such file"))
   (unless (bolp) (insert "\n"))
   (let ((beg (point))
-- 
2.31.1

From 9bb017cfc8284075e04faf5496ed560ba48d5bbc Mon Sep 17 00:00:00 2001
From: Utkarsh Singh <utkarsh190601@gmail.com>
Date: Wed, 28 Apr 2021 10:42:32 +0530
Subject: [PATCH 2/3] org-table.el (org-table-convert-region): move out
 separator-guessing

1. Move separator guessing code to org-table-guess-separator (new
function).
2. Add semicolon, colon and SPACE to the list of known separators
(separators which we can guess).
---
 lisp/org/org-table.el | 49 +++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 11 deletions(-)

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index e0b2be6892..295f7a9b90 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -846,6 +846,39 @@ org-table-create
       (goto-char pos))
     (org-table-align)))
 
+(defun org-table-guess-separator (beg0 end0)
+  "Guess separator for region BEG0 to END0.
+
+List of preferred separators (in order of preference):
+comma, TAB, semicolon, colon or SPACE.
+
+For each separator, search for a line which doesn't contain it.
+If such a line is found, try the next preferred separator;
+otherwise return the separator as a string.  Return nil if none fits."
+  (let* ((beg (save-excursion
+                (goto-char (min beg0 end0))
+                (skip-chars-forward " \t\n")
+                (if (eobp) (point) (line-beginning-position))))
+        (end (save-excursion
+                (goto-char (max beg0 end0))
+                (skip-chars-backward " \t\n" beg)
+                (if (= beg (point)) (point) (line-end-position))))
+         (sep-regexp
+          (list (list ","  (rx bol (1+ (not (or ?\n ?,))) eol))
+               (list "\t" (rx bol (1+ (not (or ?\n ?\t))) eol))
+               (list ";"  (rx bol (1+ (not (or ?\n ?\;))) eol))
+               (list ":"  (rx bol (1+ (not (or ?\n ?:))) eol))
+               (list " "  (rx bol (1+ (not (or ?\n ?\s))) eol)))))
+    (unless (= beg end)
+      (save-excursion
+        (goto-char beg)
+        (catch :found
+          (pcase-dolist (`(,sep ,regexp) sep-regexp)
+            (save-excursion
+              (unless (re-search-forward regexp end t)
+                (throw :found sep))))
+          nil)))))
+
 ;;;###autoload
 (defun org-table-convert-region (beg0 end0 &optional separator)
   "Convert region to a table.
@@ -862,10 +895,7 @@ org-table-convert-region
 integer  When a number, use that many spaces, or a TAB, as field separator
 regexp   When a regular expression, use it to match the separator
 nil      When nil, the command tries to be smart and figure out the
-         separator in the following way:
-         - when each line contains a TAB, assume TAB-separated material
-         - when each line contains a comma, assume CSV material
-         - else, assume one or more SPACE characters as separator."
+         separator using `org-table-guess-separator'."
   (interactive "r\nP")
   (let* ((beg (min beg0 end0))
         (end (max beg0 end0))
@@ -882,13 +912,10 @@ org-table-convert-region
       (if (bolp) (backward-char 1) (end-of-line 1))
       (setq end (point-marker))
       ;; Get the right field separator
-      (unless separator
-       (goto-char beg)
-       (setq separator
-             (cond
-              ((not (re-search-forward "^[^\n\t]+$" end t)) '(16))
-              ((not (re-search-forward "^[^\n,]+$" end t)) '(4))
-              (t 1))))
+      (when (and (not separator)
+                 (not (setq separator
+                            (org-table-guess-separator beg end))))
+        (user-error "Failed to guess separator"))
       (goto-char beg)
       (if (equal separator '(4))
          (while (< (point) end)
-- 
2.31.1

From fef97ffe27ff908647c45f1b066a845e71a0926f Mon Sep 17 00:00:00 2001
From: Utkarsh Singh <utkarsh190601@gmail.com>
Date: Wed, 28 Apr 2021 14:01:31 +0530
Subject: [PATCH 3/3] org-table.el (org-table-import): add file prompt

---
 lisp/org/org-table.el | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lisp/org/org-table.el b/lisp/org/org-table.el
index 295f7a9b90..e904903576 100644
--- a/lisp/org/org-table.el
+++ b/lisp/org/org-table.el
@@ -963,7 +963,8 @@ org-table-import
 - (64)    Prompt for a regular expression as field separator.
 - integer When a number, use that many spaces, or a TAB, as field separator.
 - regexp  When a regular expression, use it to match the separator."
-  (interactive "f\nP")
+  (interactive (list (read-file-name "Import file: ")
+                     current-prefix-arg))
   (when (and (called-interactively-p 'any)
             (not (string-match-p (rx "." (or "txt" "tsv" "csv") eos) file))
              (not (yes-or-no-p "File does not have .txt, .tsv or .csv as extension.  Do you still want to continue? ")))
-- 
2.31.1

Attachment: example.csv
Description: csv file

-- 
Utkarsh Singh
http://utkarshsingh.xyz