Re: How to sort unicode properly?
From: Eric Blake
Subject: Re: How to sort unicode properly?
Date: Wed, 25 Sep 2019 10:27:58 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0
On 9/25/19 10:20 AM, Peng Yu wrote:
> Hi,
> It seems that "café" should be sorted before "caff" in Unicode.
> https://github.com/jtauber/pyuca
> But `sort` does not do so.
> $ printf '%s\n' cafe caff café | LC_ALL=UTF8 sort
> cafe
> caff
> café
> $ printf '%s\n' cafe caff café | LC_ALL=en_US.UTF-8 sort
> cafe
> caff
> café
> How to make `sort` sort according to Unicode order? Thanks.
You'll have to write a locale definition where strcoll() sorts in the
order you want. Coreutils sort is calling strcoll(), and if it doesn't
sort the way you think it should, the bug is in your locale and not in
coreutils. You'll want to report this issue to whoever provided your
en_US.UTF-8 locale (perhaps glibc?)
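
To see what strcoll() does with these strings outside of sort, a small
sketch along the following lines may help. It uses Python's locale.strcoll,
which wraps the C library's collation function. The helper name
sort_with_locale is made up for this illustration, and whether en_US.UTF-8
is installed, and what order it yields, depends on your system.

```python
import functools
import locale


def sort_with_locale(words, locale_name):
    """Return words sorted by the C library's collation for locale_name.

    sort_with_locale is a hypothetical helper for this illustration only.
    """
    # Calling setlocale() with no second argument queries the current value.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # locale.strcoll is a cmp-style function, so adapt it to a sort key.
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        # Restore the previous collation locale for the rest of the process.
        locale.setlocale(locale.LC_COLLATE, saved)


words = ["cafe", "caff", "café"]

# The "C" locale always exists. Collation there is a plain code-point
# comparison, so "é" (U+00E9) sorts after "f" (U+0066).
print(sort_with_locale(words, "C"))

# Assumption: en_US.UTF-8 may not be installed; the result is whatever
# that locale definition specifies.
try:
    print(sort_with_locale(words, "en_US.UTF-8"))
except locale.Error:
    print("en_US.UTF-8 is not installed on this system")
```

If the second line of output still puts "café" after "caff", the ordering
comes from the locale definition itself, which is consistent with the
point above that coreutils only reports what strcoll() returns.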
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org