Re: setlocale() [Was: Re: NSNumberFormater test fails]
From: Eric Wasylishen
Subject: Re: setlocale() [Was: Re: NSNumberFormater test fails]
Date: Mon, 27 Feb 2012 23:49:43 -0700
Hi,
I committed a slightly modified version of locale3.diff.
This fixes the NSNumberFormatter test failures when a non-English locale is
used (I tested French).
Let me know if you have any problems with this.
-Eric
On 2012-02-12, at 6:25 PM, Robert Slover wrote:
> Honestly this seems right to me, or at least safest, particularly for library
> code. There are too many fundamental APIs that encapsulate the current
> locale as part of their internal state without being able to preserve it. A
> good example would be a compiled regular expression -- bad things happen if
> the locale changes after the regular expression is parsed and compiled but
> before it is used. This is one of the current weaknesses of POSIX, IMHO.
>
> --Robert
>
> On Feb 12, 2012, at 15:31, Fred Kiefer <address@hidden> wrote:
>
>> That means you were right, Cocoa doesn't call setlocale(LC_ALL, "")
>> automatically. I am actually surprised by that, but if they don't, neither
>> should we.
>>
>> Fred
>>
>> On 11.02.2012 00:40, Eric Wasylishen wrote:
>>> Btw, I attached a test program test.m which shows how the current AppKit
>>> locale (and libc locale) affects various ways of printing decimal points:
>>>
>>> Mac OS X 10.7.2, locale set to French Canadian in System Preferences:
>>> new-host-2:~ ericw$ echo $LANG
>>> fr_CA.UTF-8
>>> new-host-2:~ ericw$ gcc test.m -framework Foundation -framework AppKit
>>> new-host-2:~ ericw$ ./a.out
>>> 2012-01-29 17:52:58.642 a.out[1215:707] Launched. current locale: fr_CA
>>> 2012-01-29 17:52:58.643 a.out[1215:707] NSLog Decimal test: 1.23
>>> printf decimal test: 1.23
>>> 2012-01-29 17:52:58.644 a.out[1215:707] Calling setlocale(LC_ALL, "")...
>>> 2012-01-29 17:52:58.645 a.out[1215:707] NSLog bonjour! 1.23
>>> printf bonjour! 1,23
>>> 2012-01-29 17:52:58.645 a.out[1215:707] -[NSString stringWithFormat:]: 1.23
>>> ^C
>>>
>>> GNUstep trunk:
>>> address@hidden:~$ export LC_ALL=fr_CA.UTF-8
>>> address@hidden:~$ gcc `gnustep-config --objc-flags` test.m `gnustep-config
>>> --gui-libs` -o test
>>> address@hidden:~$ ./test
>>> 2012-02-10 16:16:32.203 test[14990] Launched. current locale: fr_CA
>>> 2012-02-10 16:16:32.210 test[14990] NSLog Decimal test: 1.23
>>> printf decimal test: 1,23
>>> 2012-02-10 16:16:32.211 test[14990] Calling setlocale(LC_ALL, "")...
>>> 2012-02-10 16:16:32.211 test[14990] NSLog bonjour! 1.23
>>> printf bonjour! 1,23
>>> 2012-02-10 16:16:32.211 test[14990] -[NSString stringWithFormat:]: 1.23
>>>
>>> The only difference is the first "printf decimal test:" on GNUstep uses the
>>> comma decimal separator, because of the setlocale called by +[NSObject
>>> initialize]. On Mac OS, the libc locale is still "C".
>>>
>>> One interesting thing is that neither NSLog nor -[NSString stringWithFormat:],
>>> on Cocoa or GNUstep, uses the locale's decimal point (regardless of the
>>> setting of the AppKit locale or the libc locale).
>>>
>>>
>>> On 2012-02-08, at 12:29 PM, Fred Kiefer wrote:
>>>
>>>> I think you are right as far as ICU is concerned, when we use ICU we
>>>> should use the function uloc_setDefault() to select the locale we want.
>>>> But currently the code we have is not ICU only, it works with a mixture of
>>>> ICU and glibc and it should work without ICU. We could try to make sure
>>>> that all our calls that need locale information of any sort, go through a
>>>> wrapper that uses the corresponding ICU function when that is available.
>>>> If that is achieved, we could use only uloc_setDefault() when ICU gets
>>>> used and everything should work (and fall back to the old setlocale()
>>>> call in the other case).
>>>> But is this achievable? Who is willing to check which of our used glibc
>>>> functions use any locale information? And to rewrite all of these? Just
>>>> think of the work in NSLog and the removal of all printf calls and the
>>>> like.
>>>
>>> Actually, I don't think there would be much work to do. From what I've
>>> seen, gnustep-base doesn't use the libc locale system much (if at all). For
>>> example, GSFormat.m uses NSLocale to get the decimal separator character
>>> (it does have fallback code to print a decimal using printf, but from what
>>> I can see that will never get used.)
>>>
>>> The only place I've found the libc locale used is in GSLocale.m, in the
>>> implementation of GSDomainFromDefaultLocale().
>>>
>>> I attached a "first try" at a patch which does the following:
>>>
>>> - deprecates GSSetLocale and GSSetLocaleC - they now do nothing.
>>> - removes the call to GSSetLocaleC in +[NSObject initialize].
>>> - adds a function to GSLocale.m, GSDefaultLanguageLocale(), which returns
>>> the locale for LC_MESSAGES, and refactors the two parts of NSUserDefaults
>>> that were calling GSSetLocale to get LC_MESSAGES to use
>>> GSDefaultLanguageLocale() instead.
>>> - rewrites the parts of GSLocale.m which need to use the libc locale. Now
>>> they call setlocale(LC_ALL, ""), and afterwards restore the libc locale to
>>> what it was previously.
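The save-and-restore pattern described in the last item can be sketched in plain C. This is only an illustration under assumptions, not the code from the patch or from GSLocale.m: the helper name env_decimal_point() is hypothetical, and it reads just one piece of locale data (the decimal separator) as a stand-in for whatever the real code queries.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the GSLocale.m code): copy the decimal separator
 * of the environment's locale into buf, leaving the process locale exactly
 * as it was found. Returns 0 on success, -1 on failure. */
int env_decimal_point(char *buf, size_t size)
{
  const char *current;
  char *saved;
  int result = -1;

  assert(buf != NULL && size > 0);

  /* A NULL locale argument only queries. The returned string may be
   * overwritten by later setlocale() calls, so keep a private copy. */
  current = setlocale(LC_ALL, NULL);
  if (current == NULL)
    return -1;
  saved = malloc(strlen(current) + 1);
  if (saved == NULL)
    return -1;
  strcpy(saved, current);

  /* Switch to the locale named by LANG / LC_* in the environment. */
  if (setlocale(LC_ALL, "") != NULL)
    {
      const char *dp = localeconv()->decimal_point;

      if (strlen(dp) < size)
        {
          strcpy(buf, dp);
          result = 0;
        }
    }

  /* Put back whatever was active before, whether or not the switch worked. */
  setlocale(LC_ALL, saved);
  free(saved);
  return result;
}
```

A caller that starts in the "C" locale is still in the "C" locale afterwards, so printf and friends keep their default behaviour. Note that the window between the two setlocale() calls is still process-wide, so this sketch is not safe if other threads format numbers concurrently.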
>>>
>>> Of course if we were to apply this I would want to do a more careful scan
>>> of base for uses of the libc locale.
>>>
>>>> I am not that sure whether the selection of the locale is really up to the
>>>> application code.
>>>> When an application gets started it has a right to expect that its
>>>> supporting libraries are set up to a sensible default. This may not be
>>>> true for tools, but should be the case for applications that display a
>>>> user interface.
>>>
>>> I agree with your basic point… but I would expect the GNUstep locale system
>>> to be set up, not the libc locale.
>>>
>>> Eric
>>>
>>>> On 08.02.2012 19:25, Eric Wasylishen wrote:
>>>>> Hi,
>>>>> I just had a look into this problem. While it sounds like there is
>>>>> certainly a bug in libicu - it should not break if the libc locale is
>>>>> changed - I am very skeptical that setting the libc locale as we do in
>>>>> +[NSObject initialize] (or anywhere else... IIRC it's also done in
>>>>> NSUserDefaults) is a good idea.
>>>>>
>>>>> Just to recap, +[NSObject initialize] does setlocale(LC_ALL, ""); which
>>>>> reads the current locale from the LANG environment variable (and
>>>>> others)[1] and sets all of the libc locale settings to that locale - so
>>>>> after +[NSObject initialize], printf("%g", 1.23) will output "1,23" if
>>>>> your system locale is French, for example.
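The effect described in the recap can be observed with two small C helpers. This is a minimal sketch; the names libc_decimal_point() and format_g() are made up for illustration and do not appear in gnustep-base.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Illustrative helper: the decimal separator of the libc locale that is
 * active right now ("." in the default "C" locale). */
const char *libc_decimal_point(void)
{
  return localeconv()->decimal_point;
}

/* Illustrative helper: format value with "%g" under the active libc locale.
 * Returns what snprintf() returns. */
int format_g(double value, char *buf, size_t size)
{
  assert(buf != NULL && size > 0);
  return snprintf(buf, size, "%g", value);
}
```

In a process that has not called setlocale(), or that has selected "C" explicitly, format_g(1.23, ...) produces "1.23". Once anything in the process calls setlocale(LC_ALL, "") with a French locale in the environment (and that locale is installed), the same call is expected to produce "1,23", which is the global side effect discussed here.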
>>>>>
>>>>> My main problem with this is, I don't think any shared library really has
>>>>> the right to change this setting… if an application/tool wanted to switch
>>>>> from the default C locale to the current system locale, that should be
>>>>> the application's decision, since it has global consequences for
>>>>> everything running in that process (changing the semantics of printf!).
>>>>> But there would hardly be a point to doing that because GNUstep provides
>>>>> more powerful formatting anyway (NSNumberFormatter, etc.)
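For code that does need libc formatting with a known locale, one alternative to changing the process-wide setting is the per-thread locale API from POSIX.1-2008 (newlocale/uselocale). Nobody in this thread proposes it; the sketch below is only an illustration of the "don't touch global state" argument, and format_double_c_locale() is a hypothetical name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#ifdef __APPLE__
#include <xlocale.h>   /* newlocale/uselocale are declared here on Mac OS X */
#endif

/* Hypothetical helper: format value with "%g" using the "C" locale for the
 * calling thread only, regardless of what setlocale() selected globally.
 * Returns the snprintf() result, or -1 if the locale object can't be made. */
int format_double_c_locale(double value, char *buf, size_t size)
{
  locale_t c_locale;
  locale_t previous;
  int written;

  assert(buf != NULL && size > 0);

  c_locale = newlocale(LC_ALL_MASK, "C", (locale_t)0);
  if (c_locale == (locale_t)0)
    return -1;

  previous = uselocale(c_locale);   /* affects the calling thread only */
  written = snprintf(buf, size, "%g", value);
  uselocale(previous);              /* back to the thread's earlier locale */
  freelocale(c_locale);
  return written;
}
```

With this approach the helper writes "1.23" even if the application has called setlocale(LC_ALL, "") under a French locale, and the application's own choice of global locale is left alone.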
>>>>>
>>>>>> The "official" way of setting the locale in ICU is using
>>>>>> uloc_setDefault()
>>>>>
>>>>> According to the ICU docs, the notion of locale in ICU is totally
>>>>> independent of libc's. For number formatting, the ICU default locale only
>>>>> has an effect if you pass NULL for the locale when calling unum_open.
>>>>>
>>>>> So setting the libc locale should have no effect on ICU's default locale
>>>>> (not true because of the bug mentioned below), and vice-versa - setting
>>>>> the ICU locale has no effect on the system locale.
>>>>>
>>>>> Eric
>>>>>
>>>>> [1] actually more complicated, at least for glibc:
>>>>> http://www.gnu.org/software/libc/manual/html_mono/libc.html#Locale-Categories
>>>>>
>>>>> On 2012-01-23, at 6:43 AM, Stefan Bidi wrote:
>>>>>
>>>>>> On Mon, Jan 23, 2012 at 3:01 AM, Fred Kiefer <address@hidden> wrote:
>>>>>> That bug description is not accurate. When running the NSNumberFormatter
>>>>>> test program we only call setlocale() twice, once with "" and once with
>>>>>> NULL as the locale. That would be supported behaviour according to the
>>>>>> bug description, but clearly it is not.
>>>>>>
>>>>>> I'll attach that modified version to the bug report.
>>>>>>
>>>>>> I changed your test program to call setlocale() and now it also reports
>>>>>> NaN (See attachment). This really makes me wonder whether it is such a
>>>>>> great idea to use an internationalisation library that only supports
>>>>>> English :-(
>>>>>>
>>>>>> In all fairness, it has been classified as a bug in the library. The
>>>>>> "official" way of setting the locale in ICU is using uloc_setDefault().
>>>>>> To get the locale it's uloc_getDefault(). But I see your point and am a
>>>>>> little surprised that this is even an issue. I'm even more surprised
>>>>>> that it keeps getting pushed off to a later release. It seems to have
>>>>>> originally been scheduled for 4.6, then 4.8 and now 5.0.
>>>>>>
>>>>>> BTW: Is there a reason why the macro STRING_FROM_NUMBER calls the
>>>>>> conversion twice, even when it was successful on the first attempt? I
>>>>>> don't like the use of macros that much; it really makes it hard to tell
>>>>>> what is going on and code in macros never gets as much review as normal
>>>>>> code.
>>>>>>
>>>>>> No, it's wrong. I saw that when mucking around in there, too, but
>>>>>> didn't commit a fix (I'll do so when I get home, today). Don't ask me
>>>>>> how I managed to screw that up... I don't know either.
>>>>>>
>>>>>> On 21.01.2012 23:00, Stefan Bidi wrote:
>>>>>> After running a few more tests and still not understanding what is going
>>>>>> on, I went to Google and found this bug report:
>>>>>> http://bugs.icu-project.org/trac/ticket/8214
>>>>>>
>>>>>> Seems that ICU does not like it when we use setlocale().
>>>>>>
>>>>>> On Sat, Jan 21, 2012 at 1:53 PM, Stefan Bidi <address@hidden> wrote:
>>>>>>
>>>>>> I am completely baffled by this bug. I've been trying to debug this for
>>>>>> the last 3 hrs and have gotten absolutely nowhere. I added a unum_open
>>>>>> and unum_formatDouble call in -init and I still get NaN when
>>>>>> LANG=de_DE.UTF-8. The test program continues to work without a hitch,
>>>>>> though. Something about how we handle the NSNumberFormatterInternal
>>>>>> structure is screwing up UNumberFormat (I also added a unum_open and
>>>>>> unum_formatDouble call in basic10_4.m and it worked fine).
>>>>>>
>>>>>>
>>>>>> On Sat, Jan 21, 2012 at 11:00 AM, Stefan Bidi <address@hidden> wrote:
>>>>>>
>>>>>> On Sat, Jan 21, 2012 at 10:41 AM, Fred Kiefer<address@hidden> wrote:
>>>>>>
>>>>>> Your test code works fine here and results in 1.234 as expected. My
>>>>>> $LANG is de_DE.UTF-8. And with $LANG set to C the test runs fine.
>>>>>> Strangely enough, currentLocale ends up being en_US_POSIX.
>>>>>>
>>>>>>
>>>>>> -currentLocale looks for a Locale default, if it exists that's what it
>>>>>> uses. Do you have that set?
>>>>>>
>>>>>> This still wouldn't explain why UNumberFormat is returning NaN. Both you
>>>>>> and Philippe have a valid locale. On the plus side, if I set
>>>>>> LANG=de_DE.UTF-8 I can reproduce this. I'll go try to figure out what's
>>>>>> going on.
>>
>> _______________________________________________
>> Gnustep-dev mailing list
>> address@hidden
>> https://lists.gnu.org/mailman/listinfo/gnustep-dev