Problems with locale-dependent number parsing
From: Andreas Ettner
Subject: Problems with locale-dependent number parsing
Date: Mon, 9 May 2022 12:28:47 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

Dear Guile maintainers,
I want to report various problems with locale-dependent number
parsing in Guile version 3.0.8 (other versions, e.g. 2.2.7, have these
issues, too). Furthermore, I want to propose a patch resolving these problems.
First, consider the problems with the function ‘locale-string->integer’,
which returns two values: the parsed number and the number of characters
read:
------------------------------------------------------------
(use-modules (ice-9 i18n))
(substring "12" 0 1)
⇒ "1"
(locale-string->integer "1"
                        10
                        (make-locale LC_ALL "C"))
⇒ 1
⇒ 1
(locale-string->integer (substring "12" 0 1)
                        10
                        (make-locale LC_ALL "C"))
⇒ 12 ; expected 1
⇒ 2 ; expected 1
------------------------------------------------------------
This problem is caused by erroneous handling of substrings
in the function ‘locale-string->integer’.
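
One mechanism consistent with the output above, stated here as an
assumption and not taken from the attached patch, is that the C-level
parser receives a pointer into the buffer of the original string and
therefore keeps reading past the end of the substring. A minimal C sketch
of how parsing could be confined to the substring, by copying it into a
NUL-terminated buffer first (the helper name ‘parse_prefix_integer’ is
hypothetical):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not code from the patch: parse an integer from
   the first LEN bytes of BUF.  BUF need not be NUL-terminated at LEN,
   as is the case when it belongs to a longer string.
   Returns 0 on success and -1 if memory allocation fails.
   *CONSUMED receives the number of bytes that strtol read.  */
int
parse_prefix_integer (const char *buf, size_t len, int base,
                      long *value, size_t *consumed)
{
  char *copy, *end;

  assert (buf != NULL && value != NULL && consumed != NULL);

  /* Copy exactly LEN bytes and terminate them, so that strtol cannot
     read beyond the end of the substring.  */
  copy = malloc (len + 1);
  if (copy == NULL)
    return -1;
  memcpy (copy, buf, len);
  copy[len] = '\0';

  *value = strtol (copy, &end, base);
  *consumed = (size_t) (end - copy);
  free (copy);
  return 0;
}
```

With the buffer "12" and a length of 1, this sketch yields the value 1 and
a count of 1, whereas calling ‘strtol’ on the whole buffer yields 12.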
Moreover, ‘locale-string->integer’ throws an
"Invalid read access of chars of wide string" exception when called with
a wide string as its first argument.
An especially odd example, in which the substring contains only ASCII
characters but is still treated as a wide string, is:
------------------------------------------------------------
(use-modules (ice-9 i18n))
(substring "1\u0100" 0 1)
⇒ "1"
(locale-string->integer "1" 10 (make-locale LC_ALL "C"))
⇒ 1
⇒ 1
(locale-string->integer (substring "1\u0100" 0 1)
                        10
                        (make-locale LC_ALL "C"))
⊣ ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Invalid read access of chars of wide string: "1"
; expected values 1 and 1
------------------------------------------------------------
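
For illustration only (this is not code from the attached patch): a wide
string cannot be read as a narrow ‘char’ buffer, so one way to parse it is
to build a ‘wchar_t’ buffer and call ‘wcstol’. The sketch below assumes
that the characters are available as 32-bit code points; the helper name
‘parse_wide_integer’ and the ‘uint32_t’ representation are assumptions.
Each code point is converted individually, so the sketch does not rely on
‘wchar_t’ being 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: parse an integer from LEN code points.
   Returns 0 on success and -1 if memory allocation fails.
   *CONSUMED receives the number of characters that wcstol read.
   Assumption: a code point that fits into wchar_t has the same
   numeric value as the corresponding wchar_t.  */
int
parse_wide_integer (const uint32_t *chars, size_t len, int base,
                    long *value, size_t *consumed)
{
  wchar_t *copy, *end;
  size_t i;

  assert (chars != NULL && value != NULL && consumed != NULL);

  copy = malloc ((len + 1) * sizeof *copy);
  if (copy == NULL)
    return -1;

  for (i = 0; i < len; i++)
    {
      /* Stop at the first code point that wchar_t cannot represent;
         wcstol could not parse beyond it anyway.  */
      if ((uintmax_t) chars[i] > (uintmax_t) WCHAR_MAX)
        break;
      copy[i] = (wchar_t) chars[i];
    }
  copy[i] = L'\0';

  *value = wcstol (copy, &end, base);
  *consumed = (size_t) (end - copy);
  free (copy);
  return 0;
}
```

Applied to the code points of "1\u0100", the sketch yields the value 1 and
a count of 1, whether the length passed is 1 or 2, because ‘wcstol’ stops
at the first character that is not a digit.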
The function ‘locale-string->inexact’ has similar problems:
------------------------------------------------------------
(use-modules (ice-9 i18n))
(substring "0.5625" 0 3)
⇒ "0.5"
(locale-string->inexact "0.5"
                        (make-locale LC_ALL "C"))
⇒ 0.5
⇒ 3
(locale-string->inexact (substring "0.5625" 0 3)
                        (make-locale LC_ALL "C"))
⇒ 0.5625 ; expected 0.5
⇒ 6 ; expected 3
------------------------------------------------------------
This problem is caused by erroneous handling of substrings
in the function ‘locale-string->inexact’.
Moreover, ‘locale-string->inexact’ throws an
"Invalid read access of chars of wide string" exception when called
with a wide string as its first argument.
An especially odd example, in which the substring contains only ASCII
characters but is still treated as a wide string, is:
------------------------------------------------------------
(use-modules (ice-9 i18n))
(substring "1.25\u0100" 0 4)
⇒ "1.25"
(locale-string->inexact "1.25" (make-locale LC_ALL "C"))
⇒ 1.25
⇒ 4
(locale-string->inexact (substring "1.25\u0100" 0 4)
                        (make-locale LC_ALL "C"))
⊣ ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Invalid read access of chars of wide string: "1.25"
; expected values 1.25 and 4
------------------------------------------------------------
A proposed patch (based on Guile 3.0.8) that resolves these issues,
together with accompanying tests, is attached to this message. In the
function ‘scm_locale_string_to_integer’, a check has been added that the
parameter ‘base’ (if provided) is either 0 or an integer between 2 and 36,
as required by the functions ‘strtol’ and ‘wcstol’.
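
The range check on ‘base’ can be sketched in C as follows. This is an
illustration only, not the code from the attached patch, and the helper
names ‘valid_strtol_base’ and ‘checked_strtol’ are hypothetical. A base of
0 tells ‘strtol’ to detect the base from the prefix of the input ("0x" for
hexadecimal, a leading "0" for octal, decimal otherwise).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: true if BASE is acceptable to strtol and
   wcstol, i.e. 0 or an integer between 2 and 36.  */
bool
valid_strtol_base (long base)
{
  return base == 0 || (base >= 2 && base <= 36);
}

/* Hypothetical wrapper: validate BASE before calling strtol.
   Returns 0 on success and -1 if BASE is out of range.  END may be
   NULL if the caller does not need the end position.  */
int
checked_strtol (const char *s, long base, long *value, char **end)
{
  assert (s != NULL && value != NULL);

  if (!valid_strtol_base (base))
    return -1;

  *value = strtol (s, end, (int) base);
  return 0;
}
```

Rejecting an invalid base up front lets the caller report a clear error
instead of depending on how the C library reacts to an unsupported value.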
For the sake of portability, no assumption has been made about the
relationship between the types ‘scm_t_wchar’ and ‘wchar_t’. The proposal
is a bit long; please feel free to pick what you see fit.
Best regards,
Andreas Ettner
Attachment: patch.txt
Description: Text document