Re: coreutils uniq -d -u does not conform to POSIX
From: P
Subject: Re: coreutils uniq -d -u does not conform to POSIX
Date: Fri, 30 May 2003 10:33:30 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030210
Paul Eggert wrote:
address@hidden writes:
There are other invalid combinations not handled either:
See the last 6 lines of this patch for a summary:
http://www.pixelbeat.org/patches/textutils-2.0.21-uniq-group.diff
I don't quite follow. The last six lines seem to claim that the
following pairs of options are invalid:
-D -c
-G -c
-D -G
-D -u
-u -d
-D -d
OK forget about the -G for now, so we're left with:
-D -c #counts are redundant
-D -u #input = output (so warn user)
-D -d #-d is redundant (so warn user)
-u -d #only unique & only dups are disjoint sets => no output
#as POSIX states, but should still warn user IMHO.
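To illustrate why -u -d gives no output, here is a minimal Python sketch. It is only a model of the two filters, not the uniq.c code; the helper names (runs, only_dups, only_unique) are made up for this example.

```python
from itertools import groupby

def runs(lines):
    """Yield (line, count) for each run of adjacent identical lines."""
    for line, group in groupby(lines):
        yield line, sum(1 for _ in group)

def only_dups(lines):
    # Model of -d: one copy of each line that is repeated.
    return [line for line, n in runs(lines) if n > 1]

def only_unique(lines):
    # Model of -u: only lines that are not repeated.
    return [line for line, n in runs(lines) if n == 1]

sample = ["a", "a", "b", "c", "c", "c", "d"]
dups = only_dups(sample)       # ['a', 'c']
uniques = only_unique(sample)  # ['b', 'd']

# A run cannot have count > 1 and count == 1 at the same time,
# so applying both filters together leaves nothing to print.
both = [line for line, n in runs(sample) if n > 1 and n == 1]  # []
```

Every run falls into exactly one of the two sets, so requiring both conditions yields an empty result for any input.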
But -u -d is clearly valid, since POSIX says it's valid.
Do you have a link to the POSIX docs?
In the documentation part of the patch that I sent, we have:
(default) Discard the second and subsequent repeated lines.
-d Discard lines that are not repeated.
-u Discard the first repeated line.
This isn't as clear to me as "only print unique lines"
-D Do not discard the second and subsequent repeated lines,
but discard lines that are not repeated.
Under this convention, -D is not incompatible with -d or with -u.
agreed, but redundant => should warn user as I summarised above
It is incompatible with -c, and the patched uniq.c checks for that.
good.
coreutils uniq does not have the -G option, but if it did it would
behave like this:
-G Do not discard the second and subsequent repeated lines.
That is, -D is equivalent to -d -G. Under this interpretation, -G is
also incompatible with -c, but it's not incompatible with any other
option.
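The discard convention above can be sketched in Python as a set of independent rules, one per option. This is a hypothetical model for discussion only: the function name uniq_discard and its keyword arguments are invented here, and -c is left out because the text treats it as incompatible with -D and -G.

```python
from itertools import groupby

def uniq_discard(lines, d=False, u=False, G=False):
    """Hypothetical model: each flag adds or removes one discard rule."""
    out = []
    for line, group in groupby(lines):
        n = sum(1 for _ in group)
        if n == 1:
            # -d: discard lines that are not repeated.
            if not d:
                out.append(line)
            continue
        # -u: discard the first line of a repeated run.
        if not u:
            out.append(line)
        # Default: discard the second and subsequent repeated lines.
        # -G: do not discard them.
        if G:
            out.extend([line] * (n - 1))
    return out

sample = ["a", "a", "b", "c", "c", "c"]
default_out = uniq_discard(sample)              # ['a', 'b', 'c']
d_out = uniq_discard(sample, d=True)            # ['a', 'c']
u_out = uniq_discard(sample, u=True)            # ['b']
du_out = uniq_discard(sample, d=True, u=True)   # []
D_out = uniq_discard(sample, d=True, G=True)    # -D as -d -G: ['a', 'a', 'c', 'c', 'c']
```

In this model -d -u is simply the case where every rule discards something, which is why it is valid but prints nothing.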
Perhaps it would be better for coreutils uniq to drop the -D option,
and to have -G instead. With -G, one can easily simulate -D (since -D
== -d -G), but the converse is not true. Or if backward compatibility
is a concern, perhaps -G should be added.
All this is too complicated. IMHO uniq should just have
--group={min,max,num_to_show,delimiter}
then:
uniq default mode is same as:
--group={,,1,}
-D is the same as
--group={2,,,}
-d is the same as
--group={2,,1,}
-u is the same as
--group={1,1,,} #warn if delimiter != none
And you can do extra things like:
--group={,,,append} #warn if delimiter == none
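A rough Python sketch of how such a --group option might behave, to make the mappings above concrete. Several things here are assumptions and not part of the proposal as written: an empty min means 1, an empty max means no upper limit, an empty num_to_show means show every line of the group, and "append" means an empty line is written after each printed group. The function name uniq_group is hypothetical.

```python
from itertools import groupby

def uniq_group(lines, min_size=None, max_size=None,
               num_to_show=None, delimiter=None):
    """Hypothetical model of --group={min,max,num_to_show,delimiter}."""
    lo = 1 if min_size is None else min_size  # assumed default: 1
    out = []
    for line, group in groupby(lines):
        n = sum(1 for _ in group)
        if n < lo or (max_size is not None and n > max_size):
            continue  # group size outside [min, max]: drop the whole group
        # Assumed default: show every line of the group.
        shown = n if num_to_show is None else min(n, num_to_show)
        out.extend([line] * shown)
        if delimiter == "append":
            out.append("")  # assumed: an empty line closes each printed group
    return out

sample = ["a", "a", "b", "c", "c", "c"]
default_out = uniq_group(sample, num_to_show=1)           # {,,1,}  -> ['a', 'b', 'c']
D_out = uniq_group(sample, min_size=2)                    # {2,,,}  -> ['a', 'a', 'c', 'c', 'c']
d_out = uniq_group(sample, min_size=2, num_to_show=1)     # {2,,1,} -> ['a', 'c']
u_out = uniq_group(sample, min_size=1, max_size=1)        # {1,1,,} -> ['b']
grouped = uniq_group(sample, delimiter="append")
```

With these assumptions the four existing modes reduce to one size filter plus one "how many to show" count, which is the simplification being argued for.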
Pádraig.