coreutils

Re: Adding humanize_number to coreutiles?


From: Pádraig Brady
Subject: Re: Adding humanize_number to coreutiles?
Date: Tue, 14 Feb 2012 01:06:18 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0

On 02/13/2012 03:45 AM, Peng Yu wrote:
> 2012/2/7 Pádraig Brady <address@hidden>:
>> On 02/07/2012 03:36 AM, Peng Yu wrote:
>>> Hi,
>>>
>>> Several commands in coreutils have the -h option. I'm wondering
>>> whether anybody on the development team also thinks it would be
>>> worthwhile to export it as a standalone command. If so, I'd recommend
>>> adding such a convenience command to coreutils, as I can't find it
>>> anywhere else as a standalone command.
>>>
>>> http://siarzhuk.dyndns.org/haiku/doxygen/coreutils_2lib_2human_8c_source.html#l00154
>>
>> I've needed such functionality many times.
>> I'm thinking a printf format would be best to expose this:
>> http://lists.gnu.org/archive/html/coreutils/2011-08/msg00029.html
>>
>> %H seems like it might cause compat problems in future.
>> %{human} is more descriptive and extensible, so I'm leaning towards that.
>> Any other suggestions appreciated.
>>
>> I'll work on it this week.
> 
> I'm not sure %{human} is enough for configuring all the possible ways
> of printing a humanized number. See my code below, there are binary
> and decimal humanized numbers. Also, you'd better allow a space (or
> not) between the numbers and the letters (such as 'T', 'G').
> 
> Also embedding it in printf will make it hard to be found, I'd
> recommend to create a new command like humanizenumber, just as I did.
> 
> /tmp$ cat `which humanizenumber.sh `
> #!/usr/bin/env bash
> 
> script_name=`basename "$0" .sh`
> 
> TEMP=`getopt -o hbd:sn --long help,binary,number_of_decimal_places:,space,newline -n "${script_name}.sh" -- "$@"`
> 
> if [ $? != 0 ] ; then printf "Terminating...\n" >&2 ; exit 1 ; fi
> 
> eval set -- "$TEMP"
> 
> abspath_script=`readlink -f -e "$0"`
> script_absdir=`dirname "$abspath_script"`
> 
> number_of_decimal_places=0
> while true ; do
>   case "$1" in
>     -h|--help)
>       cat "$script_absdir"/${script_name}_help.txt
>       exit
>       ;;
>     -b|--binary)
>       binary=x
>       shift
>       ;;
>     -d|--number_of_decimal_places)
>       number_of_decimal_places="$2"
>       shift 2
>       ;;
>     -s|--space)
>       space=' '
>       shift
>       ;;
>     -n|--newline)
>       newline='\n'
>       shift
>       ;;
>     --)
>       shift
>       break
>       ;;
>     *)
>       printf "Internal error!\n">&2
>       exit 1
>       ;;
>   esac
> done
> 
> if [ $# -ne 0 ]
> then
>   n="$1"
>   if [ -n "$binary" ]
>   then
>     awk -v sum=$n \
>       -v space="$space" \
>       -v newline="$newline" \
>       -v number_of_decimal_places=$number_of_decimal_places '
>     BEGIN{
>     hum[1024**5]="P"
>     hum[1024**4]="T"
>     hum[1024**3]="G"
>     hum[1024**2]="M"
>     hum[1024]="K"
>     for (x=1024**5; x>=1024; x/=1024) {
>       if (sum>=x) {
>         printf "%." number_of_decimal_places "f" space "%s" newline, sum/x, hum[x]
>         break
>       }
>     }
>   }'
> else
>   awk -v sum=$n \
>     -v space="$space" \
>     -v newline="$newline" \
>     -v number_of_decimal_places=$number_of_decimal_places '
>   BEGIN{
>   hum[1000**5]="P"
>   hum[1000**4]="T"
>   hum[1000**3]="G"
>   hum[1000**2]="M"
>   hum[1000]="k"
>   for (x=1000**5; x>=1000; x/=1000) {
>     if (sum>=x) {
>       printf "%." number_of_decimal_places "f" space "%s" newline, sum/x, hum[x]
>       break
>     }
>   }
> }'
>   fi
> fi
> /tmp$  humanizenumber.sh -h
> Description:
>   Humanize number(s)
> 
> Usage:
>   humanizenumber.sh [Options] [NUMBER]
> 
>     NUMBER                            If not specified, then do nothing.
> 
> Options:
>   -h|--help                           Help message.
>   -b|--binary                         Default: decimal.
>   -s|--space                          Default: nospace.
> 
> Examples:
>   humanizenumber.sh 456456456
>   humanizenumber.sh -d 2 456456456
>   humanizenumber.sh -s 456456456
>   humanizenumber.sh -b 456456456
>   humanizenumber.sh -n 456456456
> 
> Author:
>   Peng Yu <address@hidden>

Looking more at this, you might be right.
Now printf already has related formatting functionality:

$ env LANG=fa_IR.utf8 printf "%I'd\n" 1234
۱٬۲۳۴

I was thinking it would be appropriate to add "human" into the mix, like:

$ env LANG=fa_IR.utf8 printf "%Hd\n" 1234
1K

$ env LANG=fa_IR.utf8 printf "%HId\n" 1234
۱K

But as you say, humanizing has several options of its own.
So would there be enough cohesive functionality to justify a separate util?
I suppose so, since one could add field processing and
multiplier support, for example.

Also, what to call it? humanize_number is too long, I think.
Perhaps we could use a more general name. Drat, I was
thinking of `numconv`, but that's taken:
http://www.unixref.com/manPages/numconv.html
Maybe `convnum`. Anyway...

A tentative design could be:


convnum [OPTIONS] [NUM]...

Numbers are taken from the command-line arguments or from stdin.

--from={auto,SI,IEC}
  If not specified, suffixes are ignored.
  auto => 1K -> 1000, 1Ki -> 1024
  SI   => 1K* -> 1000
  IEC  => 1K* -> 1024

--from-unit=<NUMBER>
  Specify the input unit size.
  --from-unit=1 is implied if not specified.

--to={SI,IEC,<NUMBER>}
  Auto-scale the numbers to SI (powers of 1000)
  or IEC (powers of 1024), so that at most 3 digits are output.
  Note the output will be standard, without a B suffix,
  i.e. 123K or 123Ki for SI and IEC respectively.
  If <NUMBER> is specified, use it as the scale.

--to-unit=<NUMBER>
  Specify the output unit size.
  --to-unit=1 is implied if not specified.

--round={ceiling,floor,nearest}
  --round=ceiling is implied if not specified.

--number-format=FORMAT
  --number-format=%d is implied if not specified.
  You can use this to put a space after the number,
  to perform grouping (with %'d),
  or to select alternative number forms (with %Id), etc.

--suffix=SUFFIX
  Example: --suffix=B

--field=NUM
  Replace the number in field NUM of each line, where fields are
  delimited by whitespace. If the new number is narrower,
  pad it with spaces to the same field width.

<NUMBER>s specified above can be numeric
with an optional suffix, like K or Ki.
(Note K should probably be SI here, unlike in other coreutils.)
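
To make the --from and --to options above a little more concrete, here is a
rough sketch in shell and awk of how the suffix handling might behave.
Nothing here is an existing tool: the function names from_auto and to_human
are made up for illustration, only the K, M, G, T and P suffixes are handled,
and --from-unit, --to-unit and --number-format are left out. It also assumes
that rounding up to a full 1000 (or 1024) should carry into the next suffix,
which the design doesn't spell out, and it does not enforce the 3-digit limit
for IEC values between 1000 and 1023.

```shell
# Hypothetical helpers illustrating the proposed semantics; not an existing tool.

# --from=auto: "1K" -> 1000, "1Ki" -> 1024, a bare number passes through.
from_auto() {
  awk -v s="$1" 'BEGIN {
    # Split the input into leading digits and whatever follows them.
    if (match(s, /^[0-9]+/)) {
      num = substr(s, 1, RLENGTH)
      suf = substr(s, RLENGTH + 1)
    } else {
      suf = "?"                      # no digits: take the error path below
    }
    base = 1000
    if (suf ~ /^[KMGTP]i$/) { base = 1024; suf = substr(suf, 1, 1) }
    if (suf == "")              power = 0
    else if (suf ~ /^[KMGTP]$/) power = index("KMGTP", suf)
    else { print "invalid number: " s > "/dev/stderr"; exit 1 }
    printf "%.0f\n", num * base ^ power
  }'
}

# --to={SI,IEC} with --round={ceiling,floor,nearest}; ceiling is the default.
to_human() {  # usage: to_human SI|IEC NUMBER [ROUND]
  awk -v mode="$1" -v n="$2" -v round="${3:-ceiling}" 'BEGIN {
    base = (mode == "IEC") ? 1024 : 1000
    split("K M G T P", suf, " ")
    n += 0
    # Divide down until the value fits below one unit of the next suffix.
    for (i = 0; n >= base && i < 5; i++) n /= base
    if (round == "floor")        r = int(n)
    else if (round == "nearest") r = int(n + 0.5)
    else                         r = (n > int(n)) ? int(n) + 1 : int(n)
    # Assumption: rounding up to a full base carries into the next suffix.
    if (r >= base && i < 5) { r = 1; i++ }
    unit = ""
    if (i > 0) { unit = suf[i]; if (mode == "IEC") unit = unit "i" }
    printf "%d%s\n", r, unit
  }'
}
```

With these, `to_human SI 1234` prints 2K (ceiling being the default), while
`to_human IEC "$(from_auto 2Mi)" floor` prints 2Mi.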

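The --field idea could be sketched in awk as below. Again this is only an
illustration, with the assumptions called out: humanize_field is a made-up
name, the conversion is a stand-in that scales to SI and truncates, and the
padding goes on the right since the design does not say which side to pad.

```shell
# Hypothetical sketch of --field=NUM: rewrite one whitespace-delimited field
# in place and keep the original column width.
humanize_field() {  # usage: humanize_field NUM < input
  awk -v want="$1" '
  # Stand-in conversion: scale to SI and truncate (e.g. 45600 -> 45K).
  function human(n,    i, suf, s) {
    split("K M G T P", suf, " ")
    n += 0                           # force a numeric comparison below
    for (i = 0; n >= 1000 && i < 5; i++) n /= 1000
    s = int(n)
    if (i) s = s suf[i]
    return s
  }
  {
    out = ""; rest = $0; k = 0
    # Walk the line token by token so the original spacing is preserved.
    while (match(rest, /[^ \t]+/)) {
      k++
      out = out substr(rest, 1, RSTART - 1)
      tok = substr(rest, RSTART, RLENGTH)
      rest = substr(rest, RSTART + RLENGTH)
      # Only the requested field is touched, and only if it is a plain number.
      if (k == want && tok ~ /^[0-9]+$/)
        tok = sprintf("%-" length(tok) "s", human(tok))
      out = out tok
    }
    print out rest
  }'
}
```

For example, `printf 'a 456456 b\n' | humanize_field 2` prints `a 456K   b`,
with the second field still six characters wide.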

cheers,
Pádraig


