bug-standards

Re: [PATCH] standards: rewrite section on quoting


From: Paul Eggert
Subject: Re: [PATCH] standards: rewrite section on quoting
Date: Thu, 22 Dec 2011 21:54:31 -0800
User-agent: Mozilla/5.0 (X11; Linux i686; rv:8.0) Gecko/20111124 Thunderbird/8.0

On 12/22/11 17:08, Karl Berry wrote:
> +In the C locale, the output of GNU programs should stick to plain
> +ASCII for quotation characters in messages to users: preferably 0x22
> +(@samp{"}) or 0x27 (@samp{'}) for both opening and closing quotes.
...
> +
> +That's for the output.  In source files, you should stick to ASCII and
> +use the ASCII character 0x60 (@samp{`}) for opening quotes.

Unfortunately I see some problems with this advice.
For example, here's a common bit of code in coreutils:

    fprintf (stderr, _("Try `%s --help' for more information.\n"),
             program_name);

Because this uses ` and ' in the source file (following part 2 of the
above advice), it will also use ` and ' in the C locale (contrary to
the advice's part 1).  On the other hand, if we change it to:

    fprintf (stderr, _("Try \"%s --help\" for more information.\n"),
             program_name);

it will use " in the C locale (following part 1) but " in the source
code (contrary to part 2).

A simple way to fix this problem is to remove part 2 of the advice.
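
To make the conflict concrete, here is a minimal sketch of the two variants. It assumes no translation catalog is loaded, which is modeled by defining _() as an identity macro rather than calling the real gettext; the helper names format_old_style and format_new_style are made up for illustration and are not coreutils functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the usual gettext _() macro (an assumption for this
   sketch): with no catalog loaded, as in the C locale, the msgid
   comes back unchanged.  */
#define _(msgid) (msgid)

/* Traditional style: ` and ' appear in the source string, so they
   also appear verbatim in untranslated output.  */
static int
format_old_style (char *buf, size_t size, char const *program_name)
{
  assert (buf != NULL && program_name != NULL);
  return snprintf (buf, size,
                   _("Try `%s --help' for more information.\n"),
                   program_name);
}

/* Proposed style: " appears in untranslated output, and therefore
   must also appear (escaped) in the source string.  */
static int
format_new_style (char *buf, size_t size, char const *program_name)
{
  assert (buf != NULL && program_name != NULL);
  return snprintf (buf, size,
                   _("Try \"%s --help\" for more information.\n"),
                   program_name);
}
```

Either way, whatever quote character is in the msgid is what an untranslated run emits, so the source rule and the output rule cannot differ.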

I see some other issues too.

 * standards.texi does not always follow its own new advice with
   respect to ` and '.

 * In the PDF output, ` is rendered as an opening single quote even
   though the text is talking about the ASCII grave accent, which
   makes things pretty confusing.

 * An example uses printf with quotes in a way that can cause printf
   to dump core.

 * Since the `...' style is so entrenched in GNU programs, if we're
   going to change it, the style manual should briefly mention the old
   style and why it changed.
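
On the printf item in the list above: the existing example passes a format containing %s with no corresponding argument, which is undefined behavior in C. A minimal sketch of the corrected call follows; the _() macro is again an identity stand-in for gettext, and format_progress is a hypothetical helper that writes into a buffer so the result is easy to check.

```c
#include <assert.h>
#include <stdio.h>

/* Identity stand-in for gettext's _() macro (an assumption for this
   sketch, modeling a run with no translation catalog).  */
#define _(msgid) (msgid)

/* Undefined behavior: the format has a %s conversion but no matching
   argument, so printf fetches an argument that was never passed.
   Left commented out on purpose:

     printf (_("Processing file \"%s\"..."));  */

/* Corrected form: every conversion in the format has an argument.  */
static int
format_progress (char *buf, size_t size, char const *file)
{
  assert (buf != NULL && file != NULL);
  return snprintf (buf, size, _("Processing file \"%s\"..."), file);
}
```

Whether the broken form actually dumps core depends on the platform and on what happens to be in the argument area, but nothing about its behavior is guaranteed.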

Here's a proposed patch that attempts to address the above issues.

Index: standards.texi
===================================================================
RCS file: /sources/gnustandards/gnustandards/standards.texi,v
retrieving revision 1.210
diff -c -r1.210 standards.texi
*** standards.texi      23 Dec 2011 00:30:29 -0000      1.210
--- standards.texi      23 Dec 2011 05:50:22 -0000
***************
*** 2374,2380 ****
  * System Functions::            Portability and ``standard'' library functions.
  * Internationalization::        Techniques for internationalization.
  * Character Set::               Use ASCII by default.
! * Quote Characters::            Use `...' in the C locale.
  * Mmap::                        How you can safely use @code{mmap}.
  @end menu
  
--- 2374,2380 ----
  * System Functions::            Portability and ``standard'' library functions.
  * Internationalization::        Techniques for internationalization.
  * Character Set::               Use ASCII by default.
! * Quote Characters::            Use "..." or '...' in the C locale.
  * Mmap::                        How you can safely use @code{mmap}.
  @end menu
  
***************
*** 3049,3060 ****
  around each string that might need translation---like this:
  
  @example
! printf (gettext ("Processing file `%s'..."));
  @end example
  
  @noindent
  This permits GNU gettext to replace the string @code{"Processing file
! `%s'..."} with a translated version.
  
  Once a program uses gettext, please make a point of writing calls to
  @code{gettext} when you add new strings that call for translation.
--- 3049,3060 ----
  around each string that might need translation---like this:
  
  @example
! printf (gettext ("Processing file \"%s\"..."), file);
  @end example
  
  @noindent
  This permits GNU gettext to replace the string @code{"Processing file
! \"%s\"..."} with a translated version.
  
  Once a program uses gettext, please make a point of writing calls to
  @code{gettext} when you add new strings that call for translation.
***************
*** 3130,3136 ****
  
  @noindent
  The problem with this example is that it assumes that plurals are made
! by adding `s'.  If you apply gettext to the format string, like this,
  
  @example
  printf (gettext ("%d file%s processed"), nfiles,
--- 3130,3136 ----
  
  @noindent
  The problem with this example is that it assumes that plurals are made
! by adding ``s''.  If you apply gettext to the format string, like this,
  
  @example
  printf (gettext ("%d file%s processed"), nfiles,
***************
*** 3139,3145 ****
  
  @noindent
  the message can use different words, but it will still be forced to use
! `s' for the plural.  Here is a better way, with gettext being applied to
  the two strings independently:
  
  @example
--- 3139,3145 ----
  
  @noindent
  the message can use different words, but it will still be forced to use
! ``s'' for the plural.  Here is a better way, with gettext being applied to
  the two strings independently:
  
  @example
***************
*** 3185,3218 ****
  @cindex quote characters
  @cindex locale-specific quote characters
  @cindex left quote
  @cindex grave accent
  
! In the C locale, GNU programs should stick to plain ASCII for quotation
! characters in messages to users: preferably 0x60 (@samp{`}) for left
! quotes and 0x27 (@samp{'}) for right quotes.  It is ok, but not
! required, to use locale-specific quotes in other locales.
! 
! The @uref{http://www.gnu.org/software/gnulib/, Gnulib} @code{quote} and
! @code{quotearg} modules provide a reasonably straightforward way to
! support locale-specific quote characters, as well as taking care of
! other issues, such as quoting a filename that itself contains a quote
! character.  See the Gnulib documentation for usage details.
! 
! In any case, the documentation for your program should clearly specify
! how it does quoting, if different than the preferred method of @samp{`}
! and @samp{'}.  This is especially important if the output of your
! program is ever likely to be parsed by another program.
! 
! Quotation characters are a difficult area in the computing world at
! this time: there are no true left or right quote characters in Latin1;
! the @samp{`} character we use was standardized there as a grave
! accent.  Moreover, Latin1 is still not universally usable.
  
! Unicode contains the unambiguous quote characters required.  However,
! Unicode and UTF-8 are not universally well-supported, either.
  
! This may change over the next few years, and then we will revisit
! this.
  
  
  @node Mmap
--- 3185,3234 ----
  @cindex quote characters
  @cindex locale-specific quote characters
  @cindex left quote
+ @cindex right quote
+ @cindex opening quote
+ @cindex single quote
+ @cindex double quote
  @cindex grave accent
+ @set txicodequoteundirected
+ @set txicodequotebacktick
  
! In the C locale, the output of GNU programs should stick to plain
! ASCII for quotation characters in messages to users: preferably 0x22
! (@samp{"}) or 0x27 (@samp{'}) for both opening and closing quotes.
! Although GNU programs traditionally used 0x60 (@samp{`}) for opening
! and 0x27 (@samp{'}) for closing quotes, nowadays quotes @samp{`like
! this'} are typically rendered asymmetrically, so quoting @samp{"like
! this"} or @samp{'like this'} typically looks better.
  
! It is ok, but not required, for GNU programs to generate
! locale-specific quotes in non-C locales.  For example:
  
! @example
! printf (gettext ("Processing file \"%s\"..."), file);
! @end example
! 
! @noindent
! Here a French translation might cause @code{gettext} to return the
! string @code{"Traitement de fichier
! @guillemetleft{}@tie{}%s@tie{}@guillemetright{}..."}, yielding quotes
! more appropriate for a French locale.
! 
! Sometimes a program may need to use opening and closing quotes
! directly.  By convention, @code{gettext} translates the string
! @samp{"`"} to the opening quote and the string @samp{"'"} to the
! closing quote, and a program can use these translations.  Generally,
! though, it is better to translate quote characters in the context of
! longer strings.
! 
! If the output of your program is ever likely to be parsed by another
! program, it is good to provide an option that makes this parsing
! reliable.  For example, you could escape special characters using
! conventions from the C language or the Bourne shell.  See for example
! the option @option{--quoting-style} of GNU @code{ls}.
! 
! @clear txicodequoteundirected
! @clear txicodequotebacktick
  
  
  @node Mmap
***************
*** 3585,3591 ****
  
  @example
  * keyboard.c (menu_bar_items, tool_bar_items)
! (Fexecute_extended_command): Deal with `keymap' property.
  @end example
  
  When you install someone else's changes, put the contributor's name in
--- 3601,3607 ----
  
  @example
  * keyboard.c (menu_bar_items, tool_bar_items)
! (Fexecute_extended_command): Deal with keymap property.
  @end example
  
  When you install someone else's changes, put the contributor's name in



