Re: expand/unexpand: add tests, refactor common code


From: Assaf Gordon
Subject: Re: expand/unexpand: add tests, refactor common code
Date: Sat, 16 Jul 2016 21:52:08 -0400

Hello,

> On Jun 27, 2016, at 06:56, Pádraig Brady <address@hidden> wrote:
> 
> On 27/06/16 06:17, Assaf Gordon wrote:
>> Hello Pádraig and all,
>> 
>>> On Jun 25, 2016, at 07:20, Pádraig Brady <address@hidden> wrote:
>>> 
>>> As part of this, or at least before looking at multibyte changes,
>>> it would be worth considering this proposal for changing the
>>> unexpand algorithm: http://bugs.gnu.org/23335
>> 
>> The above bug-report addresses this TODO item:
>> ===
>> unexpand: [http://www.opengroup.org/onlinepubs/007908799/xcu/unexpand.html]
>>  printf 'x\t \t y\n'|unexpand -t 8,9 should print its input, unmodified.
>>  printf 'x\t \t y\n'|unexpand -t 5,8 should print "x\ty\n"
>> ===
> 
> I think the second command is wrong there actually?
> Surely it should print "x\t\t y\n"

Digging a bit deeper into various 'unexpand' implementations, it seems there 
are more differences.
Attached is a summary of the output of most of coreutils' unexpand tests on 
various systems.
The trivial cases give the same results, but the trickier cases (e.g. the 
'blanks' and 'posix' tests) do differ.

The test script is here: http://files.housegordon.org/tmp/test-unexpand-2.sh
(The trailing 'ff' octet in the AIX output can be ignored; I suspect a bug in 
AIX's unexpand when the last line is not '\n'-terminated.)

Example (the inputs are 'blanks-1' and 'blanks-11' from 
<coreutils>/tests/misc/unexpand.pl; the output octets are shown in hex):

blanks-1   AIX-1                09 62 09 09 63 09 09 09 64
blanks-1   Darwin-14.4.0        20 62 09 20 63 09 09 20 64 
blanks-1   FreeBSD-10.1-RELEASE 20 62 09 20 63 09 09 20 64 
blanks-1   Linux-3.16.0-4-amd64 09 62 09 09 63 09 09 09 64
blanks-1   SunOS-5.11           20 62 20 20 63 20 20 20 64

blanks-11  AIX-1                09 09 34
blanks-11  Darwin-14.4.0        09 34 
blanks-11  FreeBSD-10.1-RELEASE 09 34 
blanks-11  Linux-3.16.0-4-amd64 09 09 34
blanks-11  SunOS-5.11           09 20 34
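
For anyone who wants to reproduce rows like the ones above on another system, 
here is a minimal sketch (not the actual test-unexpand-2.sh). The test names 
'todo-1' and 'todo-2' and the helper functions are made up for illustration; 
the inputs are the two cases from the TODO item quoted earlier, and the 
"<uname -s>-<uname -r>" label format is only my guess at how the system 
column was produced.

```shell
#!/bin/sh
# Sketch only: helper names and test names are hypothetical.

# Print stdin as space-separated hex octets on a single line.
to_hex() {
    od -An -tx1 | tr -s ' \n' ' ' | sed -e 's/^ //' -e 's/ $//'
}

# System label, assumed to be "<kernel name>-<release>".
platform_label() {
    printf '%s-%s' "$(uname -s)" "$(uname -r)"
}

# report NAME TABLIST INPUT
# INPUT is deliberately used as a printf format, so escapes such as \t
# and \n are expanded before the data reaches unexpand.
report() {
    name=$1 tabs=$2 input=$3
    printf '%-10s %-20s %s\n' "$name" "$(platform_label)" \
        "$(printf "$input" | unexpand -t "$tabs" | to_hex)"
}

report todo-1 8,9 'x\t \t y\n'
report todo-2 5,8 'x\t \t y\n'
```

Each call prints one row: test name, system label, and the octets that the 
local unexpand produced, so rows from different machines can be diffed 
directly.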


And so I wonder if it's best to leave unexpand's algorithm as-is, for the sake 
of backwards compatibility (in case someone depends on coreutils' current 
behavior),
and then focus back on multibyte character processing in 'expand' (with or 
without using the refactoring patches).

Attachment: unexpand-comparison.txt.xz
Description: Binary data



regards,
 - assaf

