bug-coreutils
Re: coreutils uniq -d -u does not conform to POSIX


From: P
Subject: Re: coreutils uniq -d -u does not conform to POSIX
Date: Fri, 30 May 2003 10:33:30 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3b) Gecko/20030210

Paul Eggert wrote:
address@hidden writes:


There are other invalid combinations not handled either:
See the last 6 lines of this patch for a summary:
http://www.pixelbeat.org/patches/textutils-2.0.21-uniq-group.diff


I don't quite follow.  The last six lines seem to claim that the
following pairs of options are invalid:

-D -c
-G -c
-D -G
-D -u
-u -d
-D -d

OK, forget about -G for now, so we're left with:

-D -c #counts are redundant
-D -u #input = output (so warn user)
-D -d #-d is redundant (so warn user)
-u -d #only unique & only dups are disjoint sets => no output,
      #as POSIX states, but should still warn user IMHO.

But -u -d is clearly valid, since POSIX says it's valid.

Do you have a link to the POSIX docs?

In the documentation part of the patch that I sent, we have:

(default) Discard the second and subsequent repeated lines.
-d Discard lines that are not repeated.
-u Discard the first repeated line.

This isn't as clear to me as "only print unique lines".
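To make the "discard" wording concrete, here is a minimal Python sketch of the three rules quoted above. It is only an illustration of the convention, not the code in uniq.c; the function name `uniq_discard` and its keyword arguments are made up for this example.

```python
from itertools import groupby

def uniq_discard(lines, d=False, u=False):
    """Model the quoted convention on runs of identical adjacent lines.

    (default) discard the second and subsequent repeated lines
    d=True    additionally discard lines that are not repeated
    u=True    additionally discard the first repeated line
    """
    out = []
    for _, grp in groupby(lines):
        run = list(grp)
        if len(run) == 1:
            # A line that is not repeated survives unless -d is given.
            if not d:
                out.append(run[0])
        elif not u:
            # First line of a repeated run survives unless -u is given;
            # the rest of the run is always discarded here.
            out.append(run[0])
    return out

sample = ["a", "a", "b", "c", "c", "c"]
print(uniq_discard(sample))                  # ['a', 'b', 'c']
print(uniq_discard(sample, d=True))          # ['a', 'c']
print(uniq_discard(sample, u=True))          # ['b']
print(uniq_discard(sample, d=True, u=True))  # []
```

With both flags set every line falls under some discard rule, which matches the "disjoint sets => no output" reading of -u -d.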

-D Do not discard the second and subsequent repeated lines,
   but discard lines that are not repeated.

Under this convention, -D is not incompatible with -d or with -u.

Agreed, but they are redundant => should warn the user, as I summarised above.

It is incompatible with -c, and the patched uniq.c checks for that.

Good.

coreutils uniq does not have the -G option, but if it did it would
behave like this:

-G Do not discard the second and subsequent repeated lines.

That is, -D is equivalent to -d -G.  Under this interpretation, -G is
also incompatible with -c, but it's not incompatible with any other
option.

Perhaps it would be better for coreutils uniq to drop the -D option,
and to have -G instead.  With -G, one can easily simulate -D (since -D
== -d -G), but the converse is not true.  Or if backward compatibility
is a concern, perhaps -G should be added.
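The claim that -D == -d -G can be checked with a small Python sketch of the discard convention. This is an illustration under the definitions quoted in this message, not the behaviour of any released uniq; -G is hypothetical, and `uniq_flags` and its parameters are names invented for this example.

```python
from itertools import groupby

def uniq_flags(lines, d=False, u=False, G=False, D=False):
    """Model -d, -u, the hypothetical -G, and -D as discard rules."""
    if D:
        # Under the interpretation above, -D is shorthand for -d -G.
        d = G = True
    out = []
    for _, grp in groupby(lines):
        run = list(grp)
        if len(run) == 1:
            if not d:            # -d discards lines that are not repeated
                out.append(run[0])
            continue
        if not u:                # -u discards the first repeated line
            out.append(run[0])
        if G:                    # -G keeps the second and subsequent lines
            out.extend(run[1:])
    return out

sample = ["a", "a", "b", "c", "c", "c"]
print(uniq_flags(sample, D=True) == uniq_flags(sample, d=True, G=True))  # True
print(uniq_flags(sample, G=True))  # ['a', 'a', 'b', 'c', 'c', 'c']
```

In this model -G on its own keeps the unrepeated line "b", which no combination involving -D can do, since -D always implies -d. That is one way to read "the converse is not true".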

All this is too complicated. IMHO uniq should just have

    --group={min,max,num_to_show,delimiter}

then:

uniq's default mode is the same as:
    --group={,,1,}
-D is the same as:
    --group={2,,,}
-d is the same as:
    --group={2,,1,}
-u is the same as:
    --group={1,1,,}       #warn if delimiter != none

And you can do extra things like:

    --group={,,,append}   #warn if delimiter == none
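A rough Python sketch of how such a --group spec could be interpreted follows. The option is only a proposal, so everything here is an assumption: an empty min means 1, an empty max means unbounded, an empty num_to_show means "show the whole group", an empty delimiter means "none", and "append" is taken to mean an empty line after each printed group. The names `parse_group` and `uniq_group` are invented for the example.

```python
from itertools import groupby

def parse_group(spec):
    """Parse '{min,max,num_to_show,delimiter}'; empty fields are unset."""
    fields = spec.strip("{}").split(",")
    if len(fields) != 4:
        raise ValueError("expected four comma-separated fields")
    lo, hi, show = (int(f) if f else None for f in fields[:3])
    return lo or 1, hi, show, fields[3] or "none"

def uniq_group(lines, spec):
    """Select runs of identical adjacent lines according to the spec."""
    lo, hi, show, delim = parse_group(spec)
    out = []
    for _, grp in groupby(lines):
        run = list(grp)
        # Skip groups whose size is outside [min, max].
        if len(run) < lo or (hi is not None and len(run) > hi):
            continue
        # Show the first num_to_show lines, or the whole group if unset.
        out.extend(run if show is None else run[:show])
        if delim == "append":
            out.append("")  # assumed: delimiter is an empty line
    return out

sample = ["a", "a", "b", "c", "c", "c"]
print(uniq_group(sample, "{,,1,}"))   # default: ['a', 'b', 'c']
print(uniq_group(sample, "{2,,,}"))   # -D: ['a', 'a', 'c', 'c', 'c']
print(uniq_group(sample, "{2,,1,}"))  # -d: ['a', 'c']
print(uniq_group(sample, "{1,1,,}"))  # -u: ['b']
```

The warnings mentioned above (for example, a delimiter combined with -u style output) are left out of the sketch.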

Pádraig.