coreutils

Re: BUG in sort --numeric-sort --unique


From: carl hansen
Subject: Re: BUG in sort --numeric-sort --unique
Date: Thu, 13 Feb 2020 15:23:34 -0800

Try:   sort --debug
This is probably not a bug. "info sort" says:

‘-n’
‘--numeric-sort’
‘--sort=numeric’
Sort numerically.  The number begins each line and consists of
optional blanks, an optional ‘-’ sign, and zero or more digits
possibly separated by thousands separators, optionally followed by
a decimal-point character and zero or more digits.

Probably when it sees 1.2.3.4 it interprets it as 1.2, and 1.2.88.99
is also 1.2, so the two lines compare equal and -u discards one of them.
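
To check this explanation, here is a small sketch using the two example
addresses above. The variable names are only for illustration, and
LC_ALL=C is an assumption made here so that "." is the decimal point
regardless of the user's locale. The last command is one possible way to
sort and deduplicate on all four dot-separated fields; it is not something
proposed in the original report.

```shell
# Two distinct addresses that share the same leading "1.2"
ips='1.2.3.4
1.2.88.99'

# -n compares only the leading number (1.2 for both lines),
# so -u considers the lines equal and keeps just one of them
count_nu=$(printf '%s\n' "$ips" | LC_ALL=C sort -nu | wc -l)

# Plain -u compares whole lines, so both survive
count_u=$(printf '%s\n' "$ips" | LC_ALL=C sort -u | wc -l)

# Split on "." and compare each of the four fields numerically;
# -u then applies to all four keys together
count_fields=$(printf '%s\n' "$ips" \
  | LC_ALL=C sort -u -t . -k1,1n -k2,2n -k3,3n -k4,4n | wc -l)

echo "sort -nu: $count_nu, sort -u: $count_u, per-field: $count_fields"
```

With GNU sort the first count should be 1 and the other two 2, which
matches the drop from 1262 to 685 lines seen in the report: every address
is reduced to its first two octets before -u is applied.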

On Thu, Feb 13, 2020 at 3:06 PM Stefano Pederzani
<address@hidden> wrote:
>
> Hello.
> The bug occurs when "sort -nu" is used in a pipe on a list of IP
> addresses. Every line contains only something like "1.2.3.4".
>
> The problem is the same on these two different distributions:
> 1) Linux li302-235 5.1.17-x86_64-linode128 #1 SMP PREEMPT Wed Jul 10
> 17:11:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux (an Ubuntu system)
> 2) Linux pepe.mi.bo.it 2.6.32-754.25.1.el6.x86_64 #1 SMP Mon Dec 23
> 15:19:53 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux (a CentOS 6.10 system)
>
> Given a list of IP addresses, these are the command lines:
> # cat controllareARCHIVIO_2020/02/controllare20200213.txt | wc -l
> 1264
> That's the number of lines without sort
>
> # cat controllareARCHIVIO_2020/02/controllare20200213.txt | sort -u | wc -l
> 1262
> That's the number of unique lines with sort --unique
>
> # cat controllareARCHIVIO_2020/02/controllare20200213.txt | sort -nu | wc -l
> 685
> That IS NOT the number of unique lines! Why should ordering them
> numerically change the number?
>
> In fact, separating the parameters:
> # cat controllareARCHIVIO_2020/02/controllare20200213.txt | sort -u |
> sort -n | wc -l
> 1262
> we work around the bug.
>
> I did not find any report of this on
> https://lists.gnu.org/archive/html/bug-coreutils/
> so I wrote.
>
> I am available for further explanation.
>
> Thanks in advance,
> Best Greetings
>
> --
>
> STEFANO PEDERZANI
>
> Amministratore di Sistemi Informatici
> System Administrator
> Amministratore di Database
> Database Administrator
>
> Email:
> address@hidden
> Tel. +39 347 1645440
>
> www.icomeinformatica.com
>
>
>


