guile-devel

Re: [PATCH] fix locale string reading


From: Mark H Weaver
Subject: Re: [PATCH] fix locale string reading
Date: Mon, 14 Nov 2011 19:02:08 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.90 (gnu/linux)

Hi Ludovic!

address@hidden (Ludovic Courtès) writes:
>> I think we should consider decoding the command-line arguments using the
>> locale specified by the environment variables, at least in cases like
>> this where there's no way for the user to call setlocale before the
>> conversion happens.
>
> Below is a patch that does roughly that (we should get ‘locale_encoding’
> reviewed and perhaps added to Gnulib.)
>
> It solves the problem:
>
> # With the patch.
> $ ./meta/guile -c '(setlocale LC_ALL "en_US.UTF8")(display (command-line))' -- λ
> (/home/ludo/src/guile/libguile/.libs/guile -- λ)
>
> # Previously.
> $ guile -c '(setlocale LC_ALL "en_US.UTF8")(display (command-line))' -- λ
> (guile -- ??)

Looks great, thanks! :)

I have one question, though.  You fixed scm_compile_shell_switches, but I
see another code path where command-line arguments are converted to Scheme
strings before the user is able to call setlocale, in guile.c and init.c.

main (guile.c) calls scm_boot_guile (init.c), which uses
invoke_main_func (init.c), which calls scm_set_program_arguments
(feature.c).  Does this code need to be fixed also?

    Thanks,
      Mark


> diff --git a/libguile/script.c b/libguile/script.c
> index 5e0685a..20d7b9e 100644
> --- a/libguile/script.c
> +++ b/libguile/script.c
> @@ -26,6 +26,7 @@
>  #include <stdio.h>
>  #include <errno.h>
>  #include <ctype.h>
> +#include <uniconv.h>
>  
>  #include "libguile/_scm.h"
>  #include "libguile/eval.h"
> @@ -368,6 +369,74 @@ scm_shell_usage (int fatal, char *message)
>                 : SCM_BOOL_F));
>  }
>  
> +/* Return the name of the locale encoding suggested by environment
> +   variables, even if it's not current, or NULL if no encoding is
> +   defined.  Based on Gnulib's `localcharset.c'.  */
> +static const char *
> +locale_encoding (void)
> +{
> +  const char *locale, *codeset = NULL;
> +
> +  /* Allow user to override the codeset, as set in the operating system,
> +     with standard language environment variables.  */
> +  locale = getenv ("LC_ALL");
> +  if (locale == NULL || locale[0] == '\0')
> +    {
> +      locale = getenv ("LC_CTYPE");
> +      if (locale == NULL || locale[0] == '\0')
> +        locale = getenv ("LANG");
> +    }
> +  if (locale != NULL && locale[0] != '\0')
> +    {
> +      /* If the locale name contains an encoding after the dot, return it.  */
> +      const char *dot = strchr (locale, '.');
> +
> +      if (dot != NULL)
> +        {
> +          static char buf[2 + 10 + 1];
> +          const char *modifier;
> +
> +          dot++;
> +          /* Look for the possible @... trailer and remove it, if any.  */
> +          modifier = strchr (dot, '@');
> +          if (modifier == NULL)
> +            return dot;
> +          if (modifier - dot < sizeof (buf))
> +            {
> +              memcpy (buf, dot, modifier - dot);
> +              buf [modifier - dot] = '\0';
> +              return buf;
> +            }
> +        }
> +
> +      /* Resolve through the charset.alias file.  */
> +      codeset = locale;
> +    }
> +
> +  return codeset;
> +}
> +
> +/* Return a list of strings from ARGV, which contains ARGC strings
> +   assumed to be encoded in the current locale.  Use `locale_encoding'
> +   instead of relying on `scm_from_locale_string' because the user
> +   hasn't had a chance to call (setlocale LC_ALL "") yet.  */
> +static SCM
> +locale_arguments_to_string_list (int argc, char **const argv)
> +{
> +  int i;
> +  SCM lst;
> +  const char *encoding;
> +
> +  encoding = locale_encoding ();
> +  for (i = argc - 1, lst = SCM_EOL;
> +       i >= 0;
> +       i--)
> +    lst = scm_cons (scm_from_stringn (argv[i], (size_t) -1, encoding,
> +                                      SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE),
> +                    lst);
> +
> +  return lst;
> +}
>  
>  /* Given an array of command-line switches, return a Scheme expression
>     to carry out the actions specified by the switches.
> @@ -378,7 +447,7 @@ scm_compile_shell_switches (int argc, char **argv)
>  {
>    return scm_call_2 (scm_c_public_ref ("ice-9 command-line",
>                                         "compile-shell-switches"),
> -                     scm_makfromstrs (argc, argv),
> +                     locale_arguments_to_string_list (argc, argv),
>                       (scm_usage_name
>                        ? scm_from_locale_string (scm_usage_name)
>                        : scm_from_latin1_string ("guile")));

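For reference, the codeset extraction that `locale_encoding' performs on
the value of LC_ALL, LC_CTYPE or LANG can be tried in isolation.  The
following is only a minimal sketch, not the code from the patch: the name
codeset_from_locale_name and its caller-supplied buffer are hypothetical,
chosen so that the parsing can be exercised without touching the
environment or a static buffer.  Unlike the patch, it returns NULL when
the locale name has no codeset part.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: copy the codeset part of
   LOCALE (the text between '.' and an optional '@modifier') into BUF,
   which holds SIZE bytes.  Return BUF, or NULL if LOCALE is NULL, has
   no codeset part, or the codeset does not fit in BUF.  */
static const char *
codeset_from_locale_name (const char *locale, char *buf, size_t size)
{
  const char *dot, *end;
  size_t len;

  if (locale == NULL || size == 0)
    return NULL;

  /* The codeset, if any, starts right after the first dot.  */
  dot = strchr (locale, '.');
  if (dot == NULL)
    return NULL;
  dot++;

  /* Stop at the possible @... trailer, otherwise at the end.  */
  end = strchr (dot, '@');
  len = (end != NULL) ? (size_t) (end - dot) : strlen (dot);
  if (len == 0 || len >= size)
    return NULL;

  memcpy (buf, dot, len);
  buf[len] = '\0';
  return buf;
}
```

With a 32-byte buffer, "en_US.UTF8" gives "UTF8", "de_DE.ISO-8859-15@euro"
gives "ISO-8859-15", and "fr_FR" gives NULL.  Comparing lengths as size_t
also avoids the signed/unsigned comparison between a pointer difference
and sizeof.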

