[groff] 01/01: preconv: Support Emacs coding tags at file ends.


From: G. Branden Robinson
Subject: [groff] 01/01: preconv: Support Emacs coding tags at file ends.
Date: Wed, 6 May 2020 04:10:33 -0400 (EDT)

gbranden pushed a commit to branch master
in repository groff.

commit 04783d5c7184e4c7c85d0e80b077d521c0ac15ed
Author: G. Branden Robinson <address@hidden>
AuthorDate: Wed May 6 18:00:55 2020 +1000

    preconv: Support Emacs coding tags at file ends.
    
    * src/preproc/preconv/preconv.cpp (get_tag_lines): Rename to...
    
      (get_early_tag_lines): ...this.
    
      (get_late_coding_tag): Add new function.  Search last 3000 bytes {or
      region after last form-feed control} of file for "coding:" within a
      region bracketed by "Local Variables:" and "End:".  Give up on seek,
      read, or memory allocation failures.
    
      (check_coding_tag): Rename to...
    
      (check_early_coding_tag): ...this.  Call newly-named
      get_early_tag_lines().  Update comments.
    
      (check_coding_tag): Add new function.  Try get_late_coding_tag()
      first, then fall back to check_early_coding_tag().
    
      (detect_file_encoding): Alter debugging output so it's easier to grep
      and verify Emacs coding tag detection.
    
    * src/preproc/preconv/preconv.1.man (Bugs): Delete; its sole concern was
      the absence of this feature.
    
      (Usage): Document alterations to algorithm.
    
      (Usage/Coding Tags): Add discussion of "late" (in the file) coding
      tags.  Restyle early tag example.  Stop manipulating adjustment.  Use
      hyphen-minus (\- escape) characters in coding tag names, since they
      are literals that one might paste into an editor window.
    
      Stop referencing XEmacs, whose development is moribund as far as I
      know.
    
      Add "us-ascii" coding tag to page; while not strictly necessary, it
      facilitates testing (see below).
    
    * src/preproc/preconv/tests/late_coding_tags_work.sh: Test.
    
    * src/preproc/preconv/preconv.am: Run test.  Wrap long lines.
---
 ChangeLog                                          |  36 ++++
 NEWS                                               |  12 ++
 src/preproc/preconv/preconv.1.man                  | 232 +++++++++++----------
 src/preproc/preconv/preconv.am                     |  11 +-
 src/preproc/preconv/preconv.cpp                    | 130 +++++++++++-
 src/preproc/preconv/tests/late_coding_tags_work.sh |  44 ++++
 6 files changed, 340 insertions(+), 125 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 93e82fb..8998e7f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,39 @@
+2020-05-06  G. Branden Robinson <address@hidden>
+
+       preconv: Support Emacs local variable lists at ends of files.
+
+       * src/preproc/preconv/preconv.cpp (get_tag_lines): Rename to...
+       (get_early_tag_lines): ...this.
+       (get_late_coding_tag): Add new function.  Search last 3000 bytes
+       {or region after last form-feed control} of file for "coding:"
+       within a region bracketed by "Local Variables:" and "End:".
+       Give up on seek, read, or memory allocation failures.
+       (check_coding_tag): Rename to...
+       (check_early_coding_tag): ...this.  Call newly-named
+       get_early_tag_lines().  Update comments.
+       (check_coding_tag): Add new function.  Try get_late_coding_tag()
+       first, then fall back to check_early_coding_tag().
+       (detect_file_encoding): Alter debugging output so it's easier to
+       grep and verify Emacs coding tag detection.
+
+       * src/preproc/preconv/preconv.1.man (Bugs): Delete; its sole
+       concern was the absence of this feature.
+       (Usage): Document alterations to algorithm.
+       (Usage/Coding Tags): Add discussion of "late" (in the file)
+       coding tags.  Restyle early tag example.  Stop manipulating
+       adjustment.  Use hyphen-minus (\- escape) characters in coding
+       tag names, since they are literals that one might paste into an
+       editor window.
+
+       Stop referencing XEmacs, whose development is moribund as far as
+       I know.
+
+       Add "us-ascii" coding tag to page; while not strictly necessary,
+       it facilitates testing (see below).
+
+       * src/preproc/preconv/tests/late_coding_tags_work.sh: Test.
+       * src/preproc/preconv/preconv.am: Run test.  Wrap long lines.
+
 2020-05-05  G. Branden Robinson <address@hidden>
 
        * src/utils/afmtodit/afmtodit.pl: Format usage message with
diff --git a/NEWS b/NEWS
index db4ddc6..1c0715e 100644
--- a/NEWS
+++ b/NEWS
@@ -86,6 +86,18 @@ o The new option -V emits the constructed groff command that nroff would
   prompt; this is a historical deficiency of the Bourne shell family not
   yet corrected by the POSIX standard.
 
+Preconv
+-------
+
+o preconv now supports coding tags in "late" GNU Emacs file-local
+  variable regions, that is, those which appear at ends of files.  If a
+  valid coding tag is found, one in the "early" style is not consulted.
+  Example:
+    .\" Local Variables:
+    .\" coding: utf-8
+    .\" mode: nroff
+    .\" End:
+
 Macro Packages
 --------------
 
diff --git a/src/preproc/preconv/preconv.1.man b/src/preproc/preconv/preconv.1.man
index ecca486..2f4c261 100644
--- a/src/preproc/preconv/preconv.1.man
+++ b/src/preproc/preconv/preconv.1.man
@@ -129,40 +129,72 @@ Print the version number and exit.
 .SH Usage
 .\" ====================================================================
 .
-.B preconv
+.I preconv
 tries to find the input encoding with the following algorithm.
 .
+.
 .IP 1.
 If the input encoding has been explicitly specified with option
 .BR \-e ,
 use it.
 .
+.
 .IP 2.
-Otherwise, check whether the input starts with a
-.I Byte Order Mark
-(BOM, see below).
+Otherwise,
+check whether the input starts with a Unicode Byte Order Mark
+(BOM,
+see below).
 .
 If found, use it.
 .
+.
 .IP 3.
-Otherwise, check whether there is a known
-.I coding tag
-(see below) in either the first or second input line.
+Otherwise,
+check whether there is a recognized Emacs coding tag
+(see below)
+in a file-local variables region at the end of the file.
+.
+If found, use it.
+.
+.
+.IP 4.
+Otherwise,
+check whether there is a recognized Emacs coding tag in either the first
+or second input line.
 .
 If found, use it.
 .
-.IP 4
-Finally, if the
-.B uchardet
-library
-(an encoding detector library available on most major distributions)
-is available on the system, use it to try to detect the encoding of the file.
 .
 .IP 5.
-If everything fails, use a default encoding as given with option
-.BR \-D ,
-by the current locale, or \[oq]latin1\[cq] if the locale is set to
-\[oq]C\[cq], \[oq]POSIX\[cq], or empty (in that order).
+Otherwise,
+if the
+.I uchardet
+library
+(a character-encoding detector library available on most major
+distributions)
+is available on the system,
+use it to try to infer the encoding of the file.
+.
+.
+.IP 6.
+If
+.I uchardet
+fails,
+use the encoding specified by the
+.B \-D
+option.
+.
+.
+.IP 7.
+Use the encoding specified by the current locale
+.RI ( LC_CTYPE ),
+unless the locale is
+\[lq]C\[rq],
+\[lq]POSIX\[rq],
+or empty,
+in which case assume \[lq]Latin-1\[rq]
+(ISO 8859-1)
+as the input file encoding.
 .
 .
 .PP
@@ -209,119 +241,114 @@ space\[cq] character \[en] something not needed normally in
 .SS "Coding tags"
 .\" ====================================================================
 .
-Editors which support more than a single character encoding need tags
-within the input files to mark the file's encoding.
+Text editors which support more than a single character encoding need
+tags within the input files to mark the file's encoding.
 .
 While it is possible to guess the right input encoding with the help of
-heuristic algorithms for data which represents a greater amount of a natural
-language, it is still just a guess.
+heuristics which are reliable for a preponderance of natural language
+texts,
+it is still just a guess.
 .
-Additionally, all algorithms fail easily for input which is either too short
-or doesn't represent a natural language.
+Additionally,
+heuristics can fail on inputs that are too short or don't represent a
+natural language.
 .
 .
 .PP
 For these reasons,
-.B preconv
-supports the coding tag convention (with some restrictions) as used by
-.B "GNU Emacs"
-and
-.B XEmacs
-(and probably other programs too).
+.I preconv
+supports the coding tag convention
+(with some restrictions)
+used by GNU\~Emacs.
 .
 .
 .PP
-Coding tags in
-.B "GNU Emacs"
-and
-.B XEmacs
-are stored in so-called
-.IR "File Variables" .
+Coding tags in GNU Emacs are indicated in specially-marked regions of an
+input file designated for \[lq]file-local variables\[rq].
 .
-.B preconv
-recognizes the following syntax form which must be put into a troff comment
-in the first or second line.
+.I preconv
+recognizes two syntax forms which should be put into
+.I roff
+comments.
+.
+The first must be placed within the last 3,000 bytes of the file,
+and must come after the last
+(if any)
+form-feed control character.
 .
 .RS
-.PP
-\-*\-
-.IR tag1 :
-.IR value1 ;
-.IR tag2 :
-.IR value2 ;
-\&.\|.\|.\& \-*\-
+.EX
+\&.\[rs]" Local Variables:
+\&.\[rs]" coding: \c
+.I encoding
+\&.\[rs]" End:
+.EE
 .RE
 .
 .
 .PP
-The only relevant tag for
-.B preconv
-is \[oq]coding\[cq] which can take the values listed below.
-.
-Here an example line which tells
-.B Emacs
-to edit a file in troff mode, and to use \%latin2 as its encoding.
+The other form must occur within the first two lines of the file.
 .
 .RS
-.PP
 .EX
-\&.\[rs]" \-*\- mode: troff; coding: latin-2 \-*\-
+.B .\[rs]" \-*\- \c
+.RB \&.\|.\|.\& ;\~\c
+.B coding: \c
+.IB encoding ;\~\c
+\&.\|.\|.\& \c
+.B \-*\-
 .EE
 .RE
 .
 .
 .PP
-The following list gives all MIME coding tags (either lowercase or
-uppercase) supported by
-.BR preconv ;
-this list is hard-coded in the source.
+The following list gives all MIME coding tags
+(either lowercase or uppercase)
+supported by
+.IR preconv .
 .
 .RS
-.PP
-.ad l
-\%big5, \%cp1047, \%euc-jp, \%euc-kr, \%gb2312, \%iso-8859-1,
-\%iso-8859-2, \%iso-8859-5, \%iso-8859-7, \%iso-8859-9, \%iso-8859-13,
-\%iso-8859-15, \%koi8-r, \%us-ascii, \%utf-8, \%utf-16, \%utf-16be,
-\%utf-16le
-.ad
+\%big5, \%cp1047, \%euc\-jp, \%euc\-kr, \%gb2312, \%iso\-8859\-1,
+\%iso\-8859\-2, \%iso\-8859\-5, \%iso\-8859\-7, \%iso\-8859\-9,
+\%iso\-8859\-13, \%iso\-8859\-15, \%koi8\-r, \%us\-ascii, \%utf\-8,
+\%utf\-16, \%utf\-16be, \%utf\-16le
 .RE
 .
 .
 .PP
-In addition, the following hard-coded list of other tags is recognized
-which eventually map to values from the list above.
+In addition,
+the following list of other tags is recognized,
+each of which is mapped to an appropriate value from the list above.
 .
 .RS
-.PP
-.ad l
-\%ascii, \%chinese-big5, \%chinese-euc, \%chinese-iso-8bit, \%cn-big5,
-\%\%cn-gb, \%cn-gb-2312, \%cp878, \%csascii, \%csisolatin1,
-\%cyrillic-iso-8bit, \%cyrillic-koi8, \%euc-china, \%euc-cn,
-\%euc-japan, \%euc-japan-1990, \%euc-korea, \%greek-iso-8bit,
-\%iso-10646/utf8, \%iso-10646/utf-8, \%iso-latin-1, \%iso-latin-2,
-\%iso-latin-5, \%iso-latin-7, \%iso-latin-9, \%japanese-euc,
-\%japanese-iso-8bit, \%jis8, \%koi8, \%korean-euc, \%korean-iso-8bit,
-\%latin-0, \%latin1, \%latin-1, \%latin-2, \%latin-5, \%latin-7,
-\%latin-9, \%mule-utf-8, \%mule-utf-16, \%mule-utf-16be,
-\%mule-utf-16-be, \%mule-utf-16be-with-signature, \%mule-utf-16le,
-\%mule-utf-16-le, \%mule-utf-16le-with-signature, \%utf8, \%utf-16-be,
-\%utf-16-be-with-signature, \%utf-16be-with-signature, \%utf-16-le,
-\%utf-16-le-with-signature, \%utf-16le-with-signature
-.ad
+\%ascii, \%chinese\-big5, \%chinese\-euc, \%chinese\-iso\-8bit,
+\%cn\-big5, \%cn\-gb, \%cn\-gb\-2312, \%cp878, \%csascii,
+\%csisolatin1, \%cyrillic\-iso\-8bit, \%cyrillic\-koi8, \%euc\-china,
+\%euc\-cn, \%euc\-japan, \%euc\-japan\-1990, \%euc\-korea,
+\%greek\-iso\-8bit, \%iso\-10646/utf8, \%iso\-10646/utf\-8,
+\%iso\-latin\-1, \%iso\-latin\-2, \%iso\-latin\-5, \%iso\-latin\-7,
+\%iso\-latin\-9, \%japanese\-euc, \%japanese\-iso\-8bit, \%jis8, \%koi8,
+\%korean\-euc, \%korean\-iso\-8bit, \%latin\-0, \%latin1, \%latin\-1,
+\%latin\-2, \%latin\-5, \%latin\-7, \%latin\-9, \%mule\-utf\-8,
+\%mule\-utf\-16, \%mule\-utf\-16be, \%mule\-utf\-16\-be,
+\%mule\-utf\-16be\-with\-signature, \%mule\-utf\-16le,
+\%mule\-utf\-16\-le, \%mule\-utf\-16le\-with\-signature, \%utf8,
+\%utf\-16\-be, \%utf\-16\-be\-with\-signature,
+\%utf\-16be\-with\-signature, \%utf\-16\-le,
+\%utf\-16\-le\-with\-signature, \%utf\-16le\-with\-signature
 .RE
 .
 .
 .PP
-Those tags are taken from
-.B "GNU Emacs"
+Trailing
+\[lq]\-dos\[rq],
+\[lq]\-unix\[rq],
 and
-.BR XEmacs ,
-together with some aliases.
+\[lq]\-mac\[rq]
+suffixes on coding tags
+(which give the end-of-line convention used in the file)
+are disregarded for the purpose of comparison with the above tags.
 .
-Trailing \%\[oq]-dos\[cq], \%\[oq]-unix\[cq], and \%\[oq]-mac\[cq]
-suffixes of coding tags (which give the end-of-line convention used in
-the file) are stripped off before the comparison with the above tags
-happens.
 .
 .\" ====================================================================
 .SS "iconv Issues"
@@ -341,37 +368,18 @@ is used.
 .
 .
 .\" ====================================================================
-.SH Bugs
-.\" ====================================================================
-.
-.B preconv
-doesn't support
-.I "local variable lists"
-yet.
-.
-This is a different syntax form to specify local variables at the end of a
-file.
-.
-.
-.\" ====================================================================
 .SH "See Also"
 .\" ====================================================================
 .
-.BR groff (@MAN1EXT@)
-.br
-the
-.B "GNU Emacs"
-and
-.B XEmacs
-info pages
+.IR groff (@MAN1EXT@)
 .
 .
 .\" Restore compatibility mode (for, e.g., Solaris 10/11).
 .cp \n[*groff_preconv_1_man_C]
 .
 .
-.\" Emacs setting
 .\" Local Variables:
+.\" coding: us-ascii
 .\" mode: nroff
 .\" End:
 .\" vim: set filetype=groff:
diff --git a/src/preproc/preconv/preconv.am b/src/preproc/preconv/preconv.am
index b2599e5..7fd7046 100644
--- a/src/preproc/preconv/preconv.am
+++ b/src/preproc/preconv/preconv.am
@@ -7,8 +7,8 @@
 # Software Foundation, either version 3 of the License, or
 # (at your option) any later version.
 #
-# groff is distributed in the hope that it will be useful, but WITHOUT ANY
-# WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# groff is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 # FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 # for more details.
 #
@@ -16,12 +16,17 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 bin_PROGRAMS += preconv
-preconv_LDADD = libgroff.a $(LIBM) $(LIBICONV) $(UCHARDET_LIBS) lib/libgnu.a
+preconv_LDADD = libgroff.a $(LIBM) $(LIBICONV) $(UCHARDET_LIBS) \
+  lib/libgnu.a
 preconv_SOURCES = src/preproc/preconv/preconv.cpp
 preconv_CPPFLAGS = $(AM_CPPFLAGS) $(UCHARDET_CFLAGS)
 man1_MANS += src/preproc/preconv/preconv.1
 EXTRA_DIST += src/preproc/preconv/preconv.1.man
 
+preconv_TESTS = \
+  src/preproc/preconv/tests/late_coding_tags_work.sh
+TESTS += $(preconv_TESTS)
+
 
 # Local Variables:
 # fill-column: 72
diff --git a/src/preproc/preconv/preconv.cpp b/src/preproc/preconv/preconv.cpp
index 50f1c42..a6e1b00 100644
--- a/src/preproc/preconv/preconv.cpp
+++ b/src/preproc/preconv/preconv.cpp
@@ -813,8 +813,8 @@ get_BOM(FILE *fp, string &BOM, string &data)
 // or NULL in case no coding tag can occur in the data
 // (which is stored unmodified in 'data').
 // ---------------------------------------------------------
-char *
-get_tag_lines(FILE *fp, string &data)
+static char *
+get_early_tag_lines(FILE *fp, string &data)
 {
   int newline_count = 0;
   int c, prev = -1;
@@ -934,8 +934,111 @@ get_variable_value_pair(char *d1, char **variable, char **value)
   return NULL;
 }
 
+// Get coding tag from Emacs local variables list at end of file.
+//
+// The region looks like this:
+//
+// Local Variables:
+// coding: latin-2
+// mode: nroff
+// End:
+//
+// Like Emacs, we search at most 3000 bytes from the end of the file, or
+// from the last form-feed control (^L) that occurs.
+//
+// Our string class doesn't support reverse searches so just use C
+// strings.
+static char *
+get_late_coding_tag(FILE *fp)
+{
+  char *coding_tag = NULL;
+  const int limit = 3000;
+  if (fseek(fp, 0, SEEK_END) != 0)
+    return NULL;
+  // Seek to `limit` bytes from the end of the buffer, or the beginning.
+  if (fseek(fp, -limit, SEEK_END) != 0)
+    if (errno == EINVAL)
+      rewind(fp);
+    else
+      return NULL;
+  char *tmpbuf = (char *) calloc(1, limit + 1 /* trailing '\0' */);
+  if (!tmpbuf) {
+    error("unable to allocate memory");
+    rewind(fp);
+    return NULL;
+  }
+  (void) fread(tmpbuf, 1, limit, fp);
+  if (ferror(fp)) {
+    error("file read error");
+    free(tmpbuf);
+    rewind(fp);
+    return NULL;
+  }
+  char *start = tmpbuf;
+  char *end = tmpbuf + strlen(tmpbuf);
+  char *ff = strrchr(tmpbuf, '\f');
+  if (ff)
+    start = ff;
+  // Find the _last_ occurrence of a local-variables section in the
+  // buffer, because the document might have Emacs file-local variables
+  // as a discussion topic, as our roff(7) man page does.
+  //
+  // strcasestr() is a GNU extension we're not using.  TODO: Gnulib has
+  // it, so we can have it, too.
+  char *lv = NULL, *nextlv = NULL;
+  const char lvstr[] = "Local Variables:";
+  // Declare these now because GCC 8 doesn't like `goto`s crossing them.
+  const char codingstr[] = "coding:";
+  // From here we must 'goto cleanup' to free our buffer and rewind the
+  // file position instead of returning early.
+  lv = strstr(start, lvstr);
+  if (!lv)
+    goto cleanup;
+  else
+    do {
+      start += strlen(lvstr);
+      nextlv = strstr(start, lvstr);
+      if (nextlv) {
+       lv = nextlv;
+       start = lv;
+      }
+    } while(nextlv);
+  end = strstr(start, "End:");
+  if (!end)
+    end = strstr(start, "end:");
+  if (!end)
+    goto cleanup;
+  // Tighten [start, end) bracket until only the coding string remains.
+  // Locate "coding:".
+  start = strstr(start, codingstr);
+  if (!start)
+    goto cleanup;
+  // Move past it.
+  start += strlen(codingstr);
+  // Skip horizontal whitespace.
+  while (strchr(" \t", *start))
+    start++;
+  // Find the next newline and advance the end pointer to it.
+  end = strchr(start, '\n');
+  if (!end)
+    end = strchr(start, '\r');
+  if (!end)
+    goto cleanup;
+  // Back up over any trailing whitespace.
+  do {
+    *end = '\0';
+    end--;
+  } while ((end > start) && strchr(" \t", *end));
+  if (start < end)
+    coding_tag = start;
+cleanup:
+  free(tmpbuf);
+  rewind(fp);
+  return coding_tag;
+}
+
 // ---------------------------------------------------------
-// Check coding tag in the read buffer.
+// Check for coding tag near the beginning of the read buffer.
 //
 // We search for the following line:
 //
@@ -965,13 +1068,11 @@ get_variable_value_pair(char *d1, char **variable, char **value)
 // Note that null bytes in the data are skipped before applying
 // the algorithm.  This should work even with files encoded as
 // UTF-16 or UTF-32 (or its siblings) in most cases.
-//
-// XXX Add support for tag at the end of buffer.
 // ---------------------------------------------------------
-char *
-check_coding_tag(FILE *fp, string &data)
+static char *
+check_early_coding_tag(FILE *fp, string &data)
 {
-  char *inbuf = get_tag_lines(fp, data);
+  char *inbuf = get_early_tag_lines(fp, data);
   char *lineend;
   for (char *p = inbuf; is_comment_line(p); p = lineend + 1) {
     if ((lineend = strchr(p, '\n')) == NULL)
@@ -1001,6 +1102,15 @@ check_coding_tag(FILE *fp, string &data)
   return NULL;
 }
 
+static char *
+check_coding_tag(FILE *fp, string &data)
+{
+  char *tag = get_late_coding_tag(fp);
+  if (!tag)
+    tag = check_early_coding_tag(fp, data);
+  return tag;
+}
+
 char *
 detect_file_encoding(FILE *fp)
 {
@@ -1120,7 +1230,7 @@ do_file(const char *filename)
     char *file_encoding = check_coding_tag(fp, data);
     if (!file_encoding) {
       if (debug_flag)
-       fprintf(stderr, "  no encoding tag\n");
+       fprintf(stderr, "  no coding tag\n");
       file_encoding = detect_file_encoding(fp);
       if (!file_encoding) {
         if (debug_flag)
@@ -1132,7 +1242,7 @@ do_file(const char *filename)
     }
     else
       if (debug_flag)
-       fprintf(stderr, "  file encoding: '%s'\n", file_encoding);
+       fprintf(stderr, "  coding tag: '%s'\n", file_encoding);
     encoding = file_encoding;
   }
   strncpy(encoding_string, encoding, MAX_VAR_LEN - 1);
diff --git a/src/preproc/preconv/tests/late_coding_tags_work.sh b/src/preproc/preconv/tests/late_coding_tags_work.sh
new file mode 100755
index 0000000..d6020b9
--- /dev/null
+++ b/src/preproc/preconv/tests/late_coding_tags_work.sh
@@ -0,0 +1,44 @@
+#!/bin/sh
+#
+# Copyright (C) 2020 Free Software Foundation, Inc.
+#
+# This file is part of groff.
+#
+# groff is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation, either version 3 of the License, or (at your
+# option) any later version.
+#
+# groff is distributed in the hope that it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+# for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+#
+
+# Ensure a predictable character encoding.
+export LC_ALL=C
+
+set -e
+
+preconv="${abs_top_builddir:-.}/preconv"
+
+# We do not find a coding tag on piped input because it isn't seekable.
+echo "testing preconv on document read from pipe" >&2
+"$preconv" -d 2>&1 > /dev/null <<EOF | grep "no coding tag"
+abc
+EOF
+
+# Instead of using temporary files, which in all fastidiousness means
+# cleaning them up even if we're interrupted, which in turn means
+# setting up signal handlers, we use files in the build tree.
+
+doc=contrib/mm/mmroff.1
+echo "testing preconv on Latin-1 document $doc" >&2
+"$preconv" -d 2>&1 > /dev/null $doc | grep "coding tag: 'latin-1'"
+
+doc=src/preproc/preconv/preconv.1
+echo "testing preconv on US-ASCII document $doc" >&2
+"$preconv" -d 2>&1 > /dev/null $doc | grep "coding tag: 'us-ascii'"


