coreutils

Re: [Patch] expand,unexpand multibyte support


From: Pádraig Brady
Subject: Re: [Patch] expand,unexpand multibyte support
Date: Mon, 18 Feb 2013 20:41:06 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1

On 02/18/2013 03:30 PM, Ondrej Oprala wrote:
> Hi, I've been working on multibyte support for the {un,}expand utilities
> lately, my approach being similar to Padraig's from 2010
> (http://lists.gnu.org/archive/html/coreutils/2010-09/msg00029.html). Both
> tools now read by lines, not bytes, and then iterate over the characters
> properly.
>
> Since both tools share huge amounts of code, I've created an expand-core.c file
> to hold it.
>
> I've also noticed that if you add libunistring to bootstrap.conf's list of
> modules, libcoreutils.a will have problems compiling, hence the gnulib patch.
>
> I was planning on doing cut next, if this gets accepted well.

Thanks for working on this!
The general approach looks good.

Since tabs are used for alignment, the display widths
of both space and non-space characters are significant.

For example, the ideographic space (U+3000) is 2 columns wide.
So it would be good to augment the tests with something like
the following, checking that the '|' columns stay aligned:

env printf '12345678
e\t|ascii(1)
\u00E9\t|composed(1)
e\u0301\t|decomposed(1)
\u3000\t|ideo-space(2)
\uFF0D\t|full-hyphen(2)
' | expand
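
The column accounting this test exercises can be sketched roughly as follows. This is an illustration in Python, not the C code from the patch. The helper names `char_width`, `display_width` and `expand_line` are hypothetical, and the width rule (combining marks count 0, East Asian wide/fullwidth count 2, everything else counts 1) is a simplifying assumption that only approximates what wcwidth() reports in a UTF-8 locale.

```python
import unicodedata


def char_width(ch: str) -> int:
    """Approximate the number of terminal columns one code point occupies.

    Assumption: a simplified stand-in for wcwidth(); control characters
    and ambiguous-width characters are simply counted as 1 column.
    """
    if unicodedata.combining(ch):
        return 0  # combining marks attach to the previous character
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters take two columns
    return 1


def display_width(text: str) -> int:
    """Sum the approximate column widths of all code points in text."""
    return sum(char_width(ch) for ch in text)


def expand_line(line: str, tabstop: int = 8) -> str:
    """Replace each tab with spaces up to the next tab stop,
    tracking the position in display columns rather than bytes."""
    out = []
    col = 0
    for ch in line:
        if ch == "\t":
            pad = tabstop - col % tabstop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            col += char_width(ch)
    return "".join(out)


# The sample lines from the test above, without the header row.
SAMPLES = [
    "e\t|ascii(1)",
    "\u00e9\t|composed(1)",
    "e\u0301\t|decomposed(1)",
    "\u3000\t|ideo-space(2)",
    "\uff0d\t|full-hyphen(2)",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(expand_line(sample))
```

Under these assumptions every '|' lands in display column 8, whereas counting bytes or code points would shift the decomposed, ideographic-space and fullwidth-hyphen lines.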

I'll try to review fully over the next while.

thanks,
Pádraig.


