Re: [PATCH] Enable utf8->string to take a range
From: Maxime Devos
Subject: Re: [PATCH] Enable utf8->string to take a range
Date: Fri, 21 Jan 2022 23:08:25 +0100
User-agent: Evolution 3.38.3-1
Vijay Marupudi wrote on Fri 21-01-2022 at 15:20 [-0500]:
+ (pass-if-exception "utf8->string range: end < start"
+ exception:out-of-range
+ (let* ((utf8 (string->utf8 "gnu guile")))
+ (utf8->string utf8 1 0)))
+ [other tests]
It would be nice to check multibyte characters as well,
to verify that byte indices and not character indices are used.
E.g., (utf8->string #vu8(195 169) 0 2) should return "é".
Another nice test: (utf8->string #vu8(195 169) 0 1) should raise
a 'decoding-error', even though #vu8(195 169) is valid UTF-8.
And (utf8->string #vu8(0 32 196) 0 2) should return "\x00 " even
though #vu8(0 32 196) as a whole is invalid UTF-8. As a bonus, it checks
that the nul character is supported, which is easily forgotten
because Guile is implemented in C, where strings are usually terminated
by a zero byte instead of carrying a length field.
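The three cases above could be written as test-suite entries roughly as
follows. This is only a sketch: it assumes the patched utf8->string
accepts optional start and end byte indices as proposed in this thread,
that a failed decode is thrown with the key 'decoding-error', and that
the file is run inside Guile's own test-suite so that (test-suite lib)
is available. The name exception:any-decoding-error is made up for this
example.

```scheme
;; Sketch only: relies on the START/END arguments proposed in this patch.
(use-modules (test-suite lib)
             (rnrs bytevectors))

;; Hypothetical helper: match the key 'decoding-error with any message.
(define exception:any-decoding-error
  (cons 'decoding-error ".*"))

(with-test-prefix "utf8->string range: multibyte"

  ;; Bytes 195 169 encode U+00E9 (e with acute accent).
  ;; END = 2 must count bytes, not characters.
  (pass-if-equal "whole two-byte character"
      (string (integer->char #xE9))
    (utf8->string #vu8(195 169) 0 2))

  ;; The range ends in the middle of the two-byte sequence, so decoding
  ;; should fail even though the full bytevector is valid UTF-8.
  (pass-if-exception "range splits a character"
      exception:any-decoding-error
    (utf8->string #vu8(195 169) 0 1))

  ;; The dangling lead byte at index 2 lies outside the range, and the
  ;; nul byte at index 0 must be preserved in the result.
  (pass-if-equal "invalid byte outside range, nul preserved"
      (string #\nul #\space)
    (utf8->string #vu8(0 32 196) 0 2)))
```

The expected strings are built with (string ...) rather than written as
literals, so the test does not depend on the source file encoding or on
how the reader parses "\x00".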
Overall, the patch you sent seems a reasonable approach to me, though
I didn't verify the details. I find myself at times copying a part
of a bytevector to a new bytevector because some procedure doesn't
allow specifying byte ranges ...
Greetings,
Maxime