emacs-devel

Re: CSV parsing and other issues (Re: LC_NUMERIC)


From: Maxim Nikulin
Subject: Re: CSV parsing and other issues (Re: LC_NUMERIC)
Date: Thu, 17 Jun 2021 00:27:49 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.8.1

On 15/06/2021 00:19, Eli Zaretskii wrote:
>> From: Maxim Nikulin Date: Mon, 14 Jun 2021 23:38:19 +0700
>>>> You forgot `setlocale(LC_NUMERIC, "C")', didn't you?
>>>
>>> No, I didn't.  Adding a call to setlocale to locale-info, even if we
>>> want to add an argument for the caller to control the locale, is
>>> trivial.
>>
>> I would avoid such manipulations and the reason is not efficiency of
>> particular implementation.
>
> But we already do that in locale-info, for locale categories other
> than LC_NUMERIC.

I have seen that call for collation. It may have been reasonable in the past (e.g. as quick plumbing), but I think such things should be avoided for the sake of thread safety. Moreover, you yourself complain that implementations other than glibc are inefficient.

Proper instruments for concurrency and parallel execution may alleviate
issues like the following:
https://lists.gnu.org/archive/html/emacs-devel/2021-05/msg01297.html
> I hear quite a few people run at least two instances of
> Emacs, for example if they don't want Gnus fetching new
> articles and email to freeze the interactive session for
> prolonged times.

> Which property will help here?  We don't have such properties.  They
> need to be designed and implemented.

Let's name it "locale". Its value is some object that represents either a "solid" locale such as de_DE or a combined one such as LC_NUMERIC=en_GB + LC_TIME=de_DE + default fr_FR. Data required for particular operations may be loaded on demand.

> How do you associate such an object with text of a buffer or a string
> such that different parts of the text could have different "locales"
> (as required for a multi-lingual editor such as Emacs)?

I already suggested some variants and you did not argue.

Technically it can be done through `set-text-properties'. If there are no such text properties, then it may be assumed that no fine-grained tuning is required, so buffer-local variables or the global environment are used. Language may be guessed from the code points of characters. Particular modes may either inhibit localization for program code or extract the necessary information from HTML lang attributes, arguments of the LaTeX \foreignlanguage macro, etc.

In my opinion, Emacs is not really multi-lingual yet due to limitations and inconveniences. Some other software has demonstrated significantly greater progress during the last decade. Maybe achieving the current level was so painful that you prefer to avoid touching the related code for any reason, not to speak of various improvements.

> And even if we had locale-downcase, which locale would you
> pass to it in any given use case?

I already mentioned a chain of responsibility: an explicit value or set of overrides passed by the user, a text property for a particular span of characters, buffer-local variables, global environment variables. A locale may be instantiated from its name, e.g. "it_IT". Convenience functions to obtain the locale at point will likely be useful as well. (Actually I am assuming number parsing and formatting rather than case conversion.)
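
To make the lookup order concrete, here is a minimal sketch in C. It is only an illustration: the `locale_sources' struct and the `resolve_locale_name' function are hypothetical names invented for this example, nothing like them exists in Emacs. The last step assumes the usual POSIX precedence of environment variables for the numeric category (LC_ALL, then LC_NUMERIC, then LANG).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lookup sources, ordered from most to least specific.
   A NULL or empty entry means "not set at this level".  */
struct locale_sources
{
  const char *explicit_override;  /* value passed by the caller */
  const char *text_property;      /* attached to a span of characters */
  const char *buffer_local;       /* per-buffer setting */
};

/* Return nonzero if S names a locale, i.e. is neither NULL nor empty.  */
int
locale_name_is_set (const char *s)
{
  return s != NULL && s[0] != '\0';
}

/* Walk the chain and return the first locale name that is set.
   Falls back to the environment, then to "C".  */
const char *
resolve_locale_name (const struct locale_sources *src)
{
  assert (src != NULL);

  if (locale_name_is_set (src->explicit_override))
    return src->explicit_override;
  if (locale_name_is_set (src->text_property))
    return src->text_property;
  if (locale_name_is_set (src->buffer_local))
    return src->buffer_local;

  /* Global environment, in POSIX precedence order for LC_NUMERIC.  */
  const char *env_names[] = { "LC_ALL", "LC_NUMERIC", "LANG" };
  for (size_t i = 0; i < sizeof env_names / sizeof env_names[0]; i++)
    {
      const char *value = getenv (env_names[i]);
      if (locale_name_is_set (value))
        return value;
    }
  return "C";
}
```

With this ordering, a text property such as "de_DE" wins over a buffer-local "fr_FR", and both lose to an explicit argument supplied by the caller.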

I am aware that such features do not exist yet. Only libc is available, but we both consider it inappropriate (you due to performance issues, me due to thread safety and possible bugs caused by missed calls that restore the old state). You are against using detailed CLDR locale data through ICU because an alternative implementation of the Unicode character tables (another part of ICU) already exists in Emacs. At the same time you refuse any attempt to discuss possible extensions from either side: low-level base functions taking the locale as an explicit argument, or high-level requirements for an interface that can "implicitly" derive the locale of a particular part of the text (that is, text prepared for intelligent handling of locales).
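
As an illustration of a base function that takes the locale as an explicit argument, here is a minimal sketch in C. It assumes a POSIX.1-2008 system that provides `newlocale', `uselocale' and `freelocale' (some platforms declare them in a different header). The function names `parse_double_in_locale' and `parse_double_c' are hypothetical, chosen only for this example. `uselocale' changes the locale of the calling thread only, so the process-wide state that `setlocale' modifies is never touched.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse TEXT as a double under the locale LOC,
   which the caller passes explicitly.  Only the calling thread's
   locale is switched, and it is restored before returning.  */
double
parse_double_in_locale (const char *text, locale_t loc)
{
  assert (text != NULL);
  locale_t saved = uselocale (loc);   /* affects this thread only */
  double value = strtod (text, NULL);
  uselocale (saved);                  /* restore the previous locale */
  return value;
}

/* Parse TEXT with "C" numeric conventions, whatever the global
   locale happens to be.  */
double
parse_double_c (const char *text)
{
  locale_t c_loc = newlocale (LC_NUMERIC_MASK, "C", (locale_t) 0);
  if (c_loc == (locale_t) 0)
    return strtod (text, NULL);       /* fallback: current locale */
  double value = parse_double_in_locale (text, c_loc);
  freelocale (c_loc);
  return value;
}
```

The save and restore pair still has to be written correctly once, but it lives inside a single helper instead of being repeated at every call site.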

Certainly, with the position "locale-aware formatting cannot be implemented because Emacs has no necessary infrastructure and such a feature is needed by only a handful of users" there is no way to improve anything.



