Re: GNU grep,awk,sed: support \u and \U for unicode
From: Assaf Gordon
Subject: Re: GNU grep,awk,sed: support \u and \U for unicode
Date: Thu, 19 Jan 2017 15:18:15 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1
Hello Eli,
On 01/19/2017 11:26 AM, Eli Zaretskii wrote:
>> From: Assaf Gordon <address@hidden>
[...]
>>
>> In case of cygwin/mingw, an extra step of converting the 'uint32' to two
>> 'uint16' is needed,
>> and then two calls for wctomb are needed.
>
> I don't see how this could work: AFAIK the MS-Windows wctomb accepts a
> single wchar_t value, so it can only support Unicode codepoints inside
> the BMP. You cannot call it with 2 wchar_t values one after the other
> to get support for the full Unicode range. (This is relevant to
> MinGW; I think Cygwin doesn't have this problem.)
Thank you for pointing this out.
It's likely I mixed up wctomb with another function.
While I'm quite certain Cygwin has a mechanism to deal with it,
I'll have to double-check MinGW.
I'll investigate and write back.
-assaf
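
For reference, the "uint32 to two uint16" step discussed above is the
standard UTF-16 surrogate-pair split. The following is only a minimal
sketch of that arithmetic; the function name codepoint_to_utf16 is
hypothetical and is not taken from grep, gawk or sed.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: split a Unicode code
   point CP into UTF-16 code units stored in OUT.  Return the number
   of units written (1 or 2), or 0 if CP is not a valid Unicode
   scalar value.  */
int
codepoint_to_utf16 (uint32_t cp, uint16_t out[2])
{
  /* Reject values past U+10FFFF and the surrogate range itself.  */
  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;

  /* Code points inside the BMP fit in a single 16-bit unit.  */
  if (cp < 0x10000)
    {
      out[0] = (uint16_t) cp;
      return 1;
    }

  /* Outside the BMP: subtract 0x10000, then put the upper 10 bits
     in a high surrogate and the lower 10 bits in a low surrogate.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 | (cp >> 10));
  out[1] = (uint16_t) (0xDC00 | (cp & 0x3FF));
  return 2;
}
```

With this sketch, U+1F600 yields the pair 0xD83D 0xDE00. It covers only
the split itself; per the quoted objection, passing the two units to
wctomb one after the other is not expected to work on MinGW.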