
Re: [Chicken-users] Codepoint indices for matched regexps (UTF-8)?


From: John Cowan
Subject: Re: [Chicken-users] Codepoint indices for matched regexps (UTF-8)?
Date: Fri, 15 Jun 2018 19:00:25 -0400



On Fri, Jun 15, 2018 at 9:44 AM, Henry Hu <address@hidden> wrote:

> I tried (use utf8), but it is documented as not affecting irregex, and sure enough it doesn't. I tried passing the 'utf8 option when compiling my regex, but that doesn't change the index returned by irregex-match-start-index.

Do "(use utf8)" and then "(import utf8-lolevel)" to get the (undocumented) low-level utf8 API.  The function utf8-offset->index accepts a string and a byte offset and returns a codepoint index.  If you want to go the other way, utf8-index->offset is also provided.
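
A minimal sketch of that conversion, assuming CHICKEN 4 with the utf8 egg installed and a UTF-8 encoded source file. The names s, m, and byte-offset are illustrative only, and since utf8-lolevel is undocumented, the argument order (string first, then offset or index) is taken from the description above rather than from any reference:

```scheme
;; CHICKEN 4 syntax; assumes the utf8 egg is installed.
(use irregex utf8)
(import utf8-lolevel)   ; undocumented low-level API, as noted above

;; Illustrative names only: s, m, byte-offset.
(define s "naïve café")              ; "ï" occupies two bytes in UTF-8
(define m (irregex-search "caf" s))  ; ASCII pattern, matched byte-wise

;; irregex reports where the whole match (submatch 0) starts, in bytes.
(define byte-offset (irregex-match-start-index m 0))

(print "byte offset:     " byte-offset)
(print "codepoint index: " (utf8-offset->index s byte-offset))

;; Going the other way should give back the original byte offset.
(print "back to offset:  "
       (utf8-index->offset s (utf8-offset->index s byte-offset)))
```

Because the two-byte "ï" comes before the match, the byte offset and the codepoint index should differ by one here.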

-- 
John Cowan          http://vrici.lojban.org/~cowan        address@hidden
I don't know half of you half as well as I should like, and I like less
than half of you half as well as you deserve.  --Bilbo

