guile-devel

Problems with locale-dependent number parsing


From: Andreas Ettner
Subject: Problems with locale-dependent number parsing
Date: Mon, 9 May 2022 12:28:47 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.0

Dear Guile maintainers,

I want to report several problems with locale-dependent number
parsing in Guile 3.0.8 (other versions, e.g. 2.2.7, have these
issues, too). I also want to propose a patch that resolves them.

First consider the problems with the function
‘locale-string->integer’ (each call returns two values: the
parsed number and the number of characters consumed):

------------------------------------------------------------
(use-modules (ice-9 i18n))

(substring "12" 0 1)
⇒ "1"

(locale-string->integer "1"
                        10
                        (make-locale LC_ALL "C"))
⇒ 1
⇒ 1

(locale-string->integer (substring "12" 0 1)
                        10
                        (make-locale LC_ALL "C"))
⇒ 12 ; expected 1
⇒ 2  ; expected 1
------------------------------------------------------------

This problem is caused by the erroneous handling of substrings
in function ‘locale-string->integer’.
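
As an illustration only (this is not the attached patch), the
following C sketch shows one plausible way such a result can arise.
It assumes, without confirmation from the Guile sources, that the
substring shares its character buffer with the original string and
that the buffer is handed to ‘strtol’ without being terminated at
the end of the substring. The function names ‘parse_slice_naive’
and ‘parse_slice’ are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Naive variant: parse directly from the shared buffer.  strtol stops
   only at the first character that is not part of a number (or at the
   terminating NUL), so it reads past the intended end of the slice. */
long
parse_slice_naive (const char *s, size_t start, int base, size_t *consumed)
{
  char *tail;
  long value = strtol (s + start, &tail, base);

  *consumed = (size_t) (tail - (s + start));
  return value;
}

/* Safer variant: copy the slice s[start, end) into its own
   NUL-terminated buffer before parsing.  Returns 0 on success and
   -1 if the temporary buffer cannot be allocated. */
int
parse_slice (const char *s, size_t start, size_t end, int base,
             long *value, size_t *consumed)
{
  size_t len = end - start;
  char *copy = malloc (len + 1);
  char *tail;

  if (copy == NULL)
    return -1;

  memcpy (copy, s + start, len);
  copy[len] = '\0';

  *value = strtol (copy, &tail, base);
  *consumed = (size_t) (tail - copy);

  free (copy);
  return 0;
}
```

Called on the buffer "12" with the slice [0, 1), the naive variant
yields 12 with 2 characters consumed, which matches the output
reported above, while the copying variant yields 1 with 1 character
consumed.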

Moreover ‘locale-string->integer’ throws an exception of
"Invalid read access of chars of wide string" when called with
a wide string as its first argument.

An especially weird example is:

------------------------------------------------------------
(use-modules (ice-9 i18n))

(substring "1\u0100" 0 1)
⇒ "1"

(locale-string->integer "1" 10 (make-locale LC_ALL "C"))
⇒ 1
⇒ 1

(locale-string->integer (substring "1\u0100" 0 1)
                        10
                        (make-locale LC_ALL "C"))
⊣ ice-9/boot-9.scm:1685:16: In procedure raise-exception:
  Invalid read access of chars of wide string: "1"
  ; expected values 1 and 1
------------------------------------------------------------

The function ‘locale-string->inexact’ has similar problems:

------------------------------------------------------------
(use-modules (ice-9 i18n))

(substring "0.5625" 0 3)
⇒ "0.5"

(locale-string->inexact "0.5"
                        (make-locale LC_ALL "C"))
⇒ 0.5
⇒ 3

(locale-string->inexact (substring "0.5625" 0 3)
                        (make-locale LC_ALL "C"))
⇒ 0.5625 ; expected 0.5
⇒ 6      ; expected 3
------------------------------------------------------------

This problem is caused by the erroneous handling of substrings
in function ‘locale-string->inexact’.

Moreover ‘locale-string->inexact’ throws an exception of
"Invalid read access of chars of wide string" when called
with a wide string as its first argument.

An especially weird example is:

------------------------------------------------------------
(use-modules (ice-9 i18n))

(substring "1.25\u0100" 0 4)
⇒ "1.25"

(locale-string->inexact "1.25" (make-locale LC_ALL "C"))
⇒ 1.25
⇒ 4

(locale-string->inexact (substring "1.25\u0100" 0 4)
                        (make-locale LC_ALL "C"))
⊣ ice-9/boot-9.scm:1685:16: In procedure raise-exception:
  Invalid read access of chars of wide string: "1.25"
  ; expected values 1.25 and 4
------------------------------------------------------------

A proposed patch (based on Guile 3.0.8) that resolves these
issues, together with accompanying tests, is attached to this
message. In the function ‘scm_locale_string_to_integer’, a check
has been added that the parameter ‘base’ (if provided) is either
0 or an integer between 2 and 36, as required by the functions
‘strtol’ and ‘wcstol’.
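
For readers who do not open the attachment, a base check of this
kind might look roughly like the C sketch below. It is not taken
from the patch; the names ‘valid_strtol_base’ and ‘checked_strtol’
are hypothetical, and reporting errors through a return code is an
assumption made for brevity.

```c
#include <errno.h>
#include <stdlib.h>

/* strtol and wcstol accept base 0 (auto-detect a "0x" or "0" prefix)
   or a base from 2 to 36 inclusive. */
int
valid_strtol_base (int base)
{
  return base == 0 || (base >= 2 && base <= 36);
}

/* Parse s in the given base.  Returns 0 and stores the result in
   *value on success; returns -1 if the base is invalid or the parsed
   number is out of range for long. */
int
checked_strtol (const char *s, int base, long *value)
{
  char *tail;

  if (!valid_strtol_base (base))
    {
      errno = EINVAL;
      return -1;
    }

  errno = 0;
  *value = strtol (s, &tail, base);
  if (errno == ERANGE)
    return -1;

  return 0;
}
```

Rejecting the base before the call avoids relying on how a given C
library reacts to an unsupported base.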

For the sake of portability, the patch makes no assumption about
the relationship between the types ‘scm_t_wchar’ and ‘wchar_t’.
The proposal is a bit long; please feel free to pick what you see
fit.
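
To illustrate what avoiding such an assumption can mean in
practice, here is a minimal C sketch, again not taken from the
patch. It uses ‘uint32_t’ as a stand-in for a 32-bit code point
type, converts the code points to ‘wchar_t’ one element at a time
with a range check, and only then calls ‘wcstol’. The function name
‘parse_code_points’ and the choice to reject unrepresentable code
points outright are assumptions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Parse an integer from an array of 32-bit code points.  Each code
   point is converted individually, so the code does not depend on
   wchar_t having the same width or signedness as the source type.
   Returns 0 on success; returns -1 if allocation fails or a code
   point does not fit into wchar_t on this platform. */
int
parse_code_points (const uint32_t *cps, size_t len, int base,
                   long *value, size_t *consumed)
{
  wchar_t *buf = malloc ((len + 1) * sizeof (wchar_t));
  wchar_t *tail;
  size_t i;

  if (buf == NULL)
    return -1;

  for (i = 0; i < len; i++)
    {
      if ((uintmax_t) cps[i] > (uintmax_t) WCHAR_MAX)
        {
          free (buf);
          return -1;
        }
      buf[i] = (wchar_t) cps[i];
    }
  buf[len] = L'\0';  /* terminate exactly at the end of the input */

  *value = wcstol (buf, &tail, base);
  *consumed = (size_t) (tail - buf);

  free (buf);
  return 0;
}
```

With the code points of "1\u0100" and a length of 1, this yields 1
with 1 character consumed, which is the result expected in the wide
string example above.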


Best regards,

Andreas Ettner

Attachment: patch.txt
Description: Text document

