Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand


From: Eric Fischer
Subject: Multibyte support for sort, uniq, join, tr, cut, paste, expand, unexpand, fmt, fold, and pr
Date: Fri, 29 Dec 2017 00:34:21 -0800

Hello Coreutils maintainers!

I've recently spent some time adding multibyte support to the coreutils
text processing tools (sort, uniq, join, tr, cut, paste, expand, unexpand,
fmt, fold, and pr) in this repository:

    https://github.com/ericfischer/coreutils-utf8

I haven't tackled cut -bn yet, or multibyte octal escapes in tr, or figured
out whether there is an appropriate way to do multibyte case mappings in
dd, and tr probably uses too much memory, but I think all the other places
where POSIX specifies characters instead of bytes are covered.

I just learned from
http://lists.gnu.org/archive/html/bug-coreutils/2017-12/msg00017.html that
there is another ongoing multibyte project. I wish I had known about that
before duplicating effort, but at least it looks like I have touched some
areas that the other branch hasn't, so I hope my changes will still be of
some use.

Before I put any more work into cleaning up my branch, I also have a
general development question: Is it OK to make multibyte additions to the
lib directory here, or do those changes need to be made in an upstream
repository or in the applications themselves?

Eric

