Re: [Wget-dev] wget2 | Progress bar: handle utf-8 characters in filename
From: Josef Möllers
Subject: Re: [Wget-dev] wget2 | Progress bar: handle utf-8 characters in filenames (#375)
Date: Tue, 21 Aug 2018 08:06:42 +0000
Can I take this? I have already looked at it and found this:
>>>
The number of characters can be counted in C in a portable way using
mbstowcs(NULL,s,0). This works for UTF-8 like for any other supported encoding,
as long as the appropriate locale has been selected. A hard-wired technique to
count the number of characters in a UTF-8 string is to count all bytes except
those in the range 0x80 – 0xBF, because these are just continuation bytes and
not characters of their own. However, the need to count characters arises
surprisingly rarely in applications.
[http://www.cl.cam.ac.uk/~mgk25/unicode.html#mod]
Keep in mind that counting codepoints will give the wrong answer if combining
characters are involved; even normalizing the input won't help as there are
graphemes which do not map to single codepoints...
[https://stackoverflow.com/questions/5117393/number-of-character-cells-used-by-string]
<<<
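
For illustration, the two counting techniques quoted above could be sketched
in C roughly as follows. This is only a sketch, not code from wget2: the
function names utf8_codepoints() and locale_chars() are made up for this
example, the byte-counting variant assumes the input is valid UTF-8, and the
mbstowcs() variant assumes the program has already selected a suitable locale
(e.g. via setlocale(LC_CTYPE, "")).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: count UTF-8 code points by counting every byte
 * that is NOT a continuation byte (0x80..0xBF, i.e. bit pattern 10xxxxxx).
 * Assumes s is valid, NUL-terminated UTF-8. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Hypothetical helper: count characters according to the current locale.
 * Returns (size_t)-1 if s is not a valid multibyte string in that locale.
 * Only meaningful after setlocale(LC_CTYPE, ...) selected a matching
 * encoding; in the default "C" locale non-ASCII input may be rejected. */
size_t locale_chars(const char *s)
{
    assert(s != NULL);
    return mbstowcs(NULL, s, 0);
}
```

Both functions count code points, not screen columns, so the caveat from the
second quote still applies: "e" followed by U+0301 (combining acute accent)
counts as 2 although it occupies one cell in the progress bar.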
--
Reply to this email directly or view it on GitLab:
https://gitlab.com/gnuwget/wget2/issues/375#note_95769646