emacs-elpa-diffs

[elpa] externals/doc-toc 23e1fb2fde 54/84: Implement HandyOutliner option


From: ELPA Syncer
Subject: [elpa] externals/doc-toc 23e1fb2fde 54/84: Implement HandyOutliner option
Date: Mon, 26 Sep 2022 13:58:38 -0400 (EDT)

branch: externals/doc-toc
commit 23e1fb2fde508724d1f9c94b5bf6db6b1f3de51b
Author: Daniel Nicolai <dalanicolai@gmail.com>
Commit: Daniel Nicolai <dalanicolai@gmail.com>

    Implement HandyOutliner option
---
 README.org  | 23 +++++++++++-----
 toc-mode.el | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 99 insertions(+), 12 deletions(-)

diff --git a/README.org b/README.org
index 86a38c2d93..cb31fac9ea 100644
--- a/README.org
+++ b/README.org
@@ -106,15 +106,27 @@ Type =C-c C-c= when done.
 
 ** 4. TOC-mode (add outline to document)
 The text of this buffer should have the right structure for adding the contents
-to (for pdf's a copy of) the original document. Final adjusments can be done but
+to (for pdf's a copy of) the original document. Final adjustments can be done but
 should not be necessary. Type =C-c C-c= for adding the contents to the
 document. 
 
-By default, the TOC is simply added to the original file. ONLY FOR PDF's, if the
-(customizable) variable ~toc-replace-original-file~ is ~nil~, then the TOC is added
+By default, the TOC is simply added to the original file. (ONLY FOR PDF's, if the
+(customizable) variable [[help:toc-replace-original-file][toc-replace-original-file]] is ~nil~, then the TOC is added
 to a copy of the original pdf file with the path as defined by the variable
 ~toc-destination-file-name~. Either a relative path to the original file
-directory or an absolute path can be given.
+directory or an absolute path can be given.)
+
+Sometimes the =pdfoutline/djvused= application is not able to add the TOC to the
+document. In that case you can either debug the problem by copying the used
+terminal command from the =*messages*= buffer and run it manually in the
+document's folder, or you can delete the outline source buffer and run
+=toc--tablist-to-handyoutliner= from the tablist buffer to get an outline source
+file that can be used with [[http://handyoutlinerfo.sourceforge.net/][HandyOutliner]] (unfortunately the handyoutliner
+command does not take arguments, but if you customize the [[help:toc-handyoutliner-path][toc-handyoutliner-path]]
+and [[help:toc-file-browser-command][toc-file-browser-command]] variables, then Emacs will try to open
+HandyOutliner and the file browser so that you can drag the files directly into
+HandyOutliner).
+
 
 
 * Key bindings
@@ -148,6 +160,3 @@ For adding TOC to document (pdf and djvu): [[http://handyoutlinerfo.sourceforge.
 # <input type="image" src="https://www.paypalobjects.com/en_US/NL/i/btn/btn_donateCC_LG.gif" border="0" name="submit" title="PayPal - The safer, easier way to pay online!" alt="Donate with PayPal button" />
 # <img alt="" border="0" src="https://www.paypal.com/en_NL/i/scr/pixel.gif" width="1" height="1" />
 # </form>
-
-
-
diff --git a/toc-mode.el b/toc-mode.el
index ccc61bd6fb..4d6f2f19c1 100644
--- a/toc-mode.el
+++ b/toc-mode.el
@@ -44,16 +44,27 @@
 
 ;; 1. Extraction Open some pdf or djvu file in Emacs (pdf-tools and djvu package
 ;; recommended). Find the pagenumbers for the TOC. Then type M-x
-;; toc-extract-pages, or M-x toc-extract-pages-ocr if doc has no text layer or
-;; text layer is bad, and answer the subsequent prompts by entering the
+;; `toc-extract-pages', or M-x `toc-extract-pages-ocr' if doc has no text layer
+;; or text layer is bad, and answer the subsequent prompts by entering the
 ;; pagenumbers for the first and the last page each followed by RET. For PDF
 ;; extraction with OCR, currently it is required to view all contents pages once
 ;; before extraction (toc-mode uses the cached file data). Also the languages
 ;; used for tesseract OCR can be customized via the `toc-ocr-languages'
 ;; variable. A buffer with the, somewhat cleaned up, extracted text will open in
 ;; TOC-cleanup mode. Prefix command with the universal argument (C-u) to omit
-;; clean and get the raw text. 2. TOC-Cleanup In this mode you can further
-;; cleanup the contents to create a list where each line has the structure:
+;; clean and get the raw text. If the extracted text is of too low quality you
+;; either can hack/extend the `toc-extract-pages-ocr' definition, or
+;; alternatively you can try to extract the text with the python
+;; document-contents-extractor script (see URL
+;; `https://pypi.org/project/document-contents-extractor/'), which is more
+;; configurable (you are also welcome to hack and improve that script).
+
+;; The documentation at URL
+;; `https://tesseract-ocr.github.io/tessdoc/Command-Line-Usage.html' might be
+;; useful.
+
+;; 2. TOC-Cleanup In this mode you can further cleanup the contents to create a
+;; list where each line has the structure:
 
 ;; TITLE (SOME) PAGENUMBER
 
@@ -114,7 +125,20 @@
 ;; added to a copy of the original pdf file with the path as defined by the
 ;; variable toc-destination-file-name. Either a relative path to the original
 ;; file directory or an absolute path can be given.
-;;; Code:
+
+;; Sometimes the `pdfoutline/djvused' application is not able to add the TOC to
+;; the document. In that case you can either debug the problem by copying the
+;; used terminal command from the `*messages*' buffer and run it manually in the
+;; document's folder, or you can delete the outline source buffer and run
+;; `toc--tablist-to-handyoutliner' from the tablist buffer to get an outline
+;; source file that can be used with HandyOutliner (see URL
+;; `http://handyoutlinerfo.sourceforge.net/') Unfortunately the handyoutliner
+;; command does not take arguments, but if you customize the
+;; `toc-handyoutliner-path' and `toc-file-browser-command' variables, then Emacs
+;; will try to open HandyOutliner and the file browser so that you can drag the
+;; files directly into HandyOutliner).
+
+;; Finally, if you just want to extract some text
 
 ;; Keybindings
 ;; all-modes (i.e. all steps)
@@ -132,10 +156,12 @@
 ;;  ~C-down/C-up~      scroll document other window (if document buffer shown)
 ;;  ~S-down/S-up~      full page scroll document other window ( idem )
 
+;;; Code:
 (require 'pdf-tools nil t)
 (require 'djvu nil t)
 (require 'evil nil t)
 
+;; List of declarations to eliminate byte-compile errors
 (defvar djvu-doc-image)
 (defvar doc-buffer)
 
@@ -153,6 +179,7 @@
 (declare-function evil-scroll-page-down "evil-commands")
 (declare-function evil-scroll-page-up "evil-commands")
 
+;;;; Customize definitions
 (defgroup toc nil
   "Setting for the toc-mode package"
   :group 'data)
@@ -176,6 +203,21 @@ by tesseract -l flag, e.g. eng or eng+nld. Use
 available languages."
   :type 'string
   :group 'toc)
+
+(defcustom toc-handyoutliner-path nil
+  "Path to handyoutliner executable.
+String (i.e. surround with double quotes). See
+URL`http://handyoutlinerfo.sourceforge.net/'."
+  :type 'file
+  :group 'toc)
+
+(defcustom toc-file-browser-command nil
+  "Command to open file browser.
+String (i.e. surround with double quotes)."
+  :type 'file
+  :group 'toc)
+
+
 ;;;; toc-extract and cleanup
 
 ;;; toc-cleanup
@@ -690,6 +732,41 @@ to `pdfoutline' shell command."
           ((string= ".djvu" ext) (toc--tablist-to-djvused))
           (t (error "Buffer-source-file does not have pdf or djvu extension")))))
 
+(defun toc--open-handy-outliner ()
+  (interactive)
+  (start-process ""
+                 nil
+                 toc-handyoutliner-path)
+  (let ((process-connection-type nil))
+    (start-process ""
+                   nil
+                   toc-file-browser-command
+                   (url-file-directory (buffer-file-name)))))
+
+;;; pdf parse tablist to
+(defun toc--tablist-to-handyoutliner ()
+  "Parse and prepare tablist-mode-buffer to source input.
+Displays results in a newlycreated buffer for use as source input
+to `pdfoutline' shell command."
+  (interactive)
+  (goto-char (point-min))
+  (let ((source-buffer (when (boundp 'doc-buffer) doc-buffer))
+        text)
+    (while (not (eobp))
+      (let* ((v (tabulated-list-get-entry))
+             (tabs (make-string (string-to-number (aref v 0)) ?\t)))
+        (setq text (concat text (format "%s %s %s\n" tabs (aref v 1) (aref v 2))))
+        (forward-line 1)))
+    (switch-to-buffer (find-file "contents.txt"))
+    (erase-buffer)
+    (toc-mode)
+    (when source-buffer
+      (setq-local doc-buffer source-buffer))
+    (insert text))
+  (save-buffer)
+  (when (and toc-handyoutliner-path toc-file-browser-command)
+    (toc--open-handy-outliner)))
+
 
 ;;;; add outline to document
 (defun toc--add-to-pdf ()
@@ -729,6 +806,7 @@ The text of the current buffer is passed as source input to either the
     (cond ((string= ".pdf" ext) (toc--add-to-pdf))
           ((string= ".djvu" ext) (toc--add-to-djvu)))))
 
+
 (provide 'toc-mode)
 
 ;;; toc-mode.el ends here
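
Not part of the commit: a minimal sketch of how the two new options might be set, and of the line format that `toc--tablist-to-handyoutliner' writes to contents.txt. The executable path and the file-browser command are placeholder values, and the `my/' helpers are hypothetical names that only mirror the `format' call in the patch above. They assume each tablist entry is a vector of three strings (level, title, page), as the patch's use of `aref' suggests.

```elisp
;;; -*- lexical-binding: t; -*-

;; Placeholder values (assumptions): adjust to the local installation.
;; Both variables are introduced by this commit and default to nil.
(setq toc-handyoutliner-path "/opt/handyoutliner/handyoutliner"
      toc-file-browser-command "nautilus")

;; Hypothetical helper, not defined in toc-mode.el.  It formats one
;; entry the same way the loop in `toc--tablist-to-handyoutliner' does:
;; LEVEL tab characters, a space, the title, a space, the page.
(defun my/toc-entry-to-handyoutliner-line (entry)
  "Return tablist ENTRY (a vector LEVEL TITLE PAGE) as one outline line."
  (let ((tabs (make-string (string-to-number (aref entry 0)) ?\t)))
    (format "%s %s %s\n" tabs (aref entry 1) (aref entry 2))))

;; Hypothetical helper: concatenate the lines for a list of entries.
(defun my/toc-entries-to-handyoutliner (entries)
  "Return the outline text for ENTRIES, a list of tablist entry vectors."
  (mapconcat #'my/toc-entry-to-handyoutliner-line entries ""))
```

For example, (my/toc-entry-to-handyoutliner-line ["1" "Intro" "5"]) returns a tab, a space, and then "Intro 5" followed by a newline. A level of 0 yields no tab, so the line starts with a single space, exactly as the format string in the patch produces it.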


