bug-gawk

Re: gawk array index: string vs integer


From: Ed Morton
Subject: Re: gawk array index: string vs integer
Date: Tue, 12 Dec 2023 05:21:29 -0600
User-agent: Mozilla Thunderbird

Running

   $ gawk --version
   GNU Awk 5.3.0, API 4.0, PMA Avon 8-g1, (GNU MPFR 4.2.1, GNU MP 6.3.0)

in

   $ bash --version
   GNU bash, version 5.2.15(3)-release (x86_64-pc-cygwin)

on Cygwin, here are the test scripts. Each one runs push()/pop() on a stack 10 million times, and together they demonstrate the roughly 3x time difference:

---------------
$ head tst_stack.awk tst_length.awk tst_numeric.awk tst_string.awk
==> tst_stack.awk <==
BEGIN {
    delete x
    for ( i=1; i<=n; i++ ) { push(x,1) }
    for ( i=1; i<=n; i++ ) { pop(x)    }
    print length(x)
}

==> tst_length.awk <==
function push(x,y) { x[length(x)+1] = y  }
function pop(x)    { delete x[length(x)] }

==> tst_numeric.awk <==
function push(x,y) { x[++x[0]] = y    }
function pop(x)    { delete x[x[0]--] }

==> tst_string.awk <==
function push(x,y) { x[++x["tos"]] = y    }
function pop(x)    { delete x[x["tos"]--] }
-----------------
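
Note that the numeric and string variants keep their top-of-stack counter inside the array itself (`x[0]` or `x["tos"]`), so `length(x)` counts that element too. A minimal sketch of this, using the numeric variant with only 3 pushes; the `stack_demo` shell wrapper and the extra `print` statements are mine, added for illustration, and are not part of the test scripts:

```shell
# Hypothetical wrapper (not in the original scripts): runs the numeric
# push()/pop() variant on a tiny stack and reports the array length.
stack_demo() {
    awk '
    function push(x,y) { x[++x[0]] = y    }   # x[0] holds the top-of-stack index
    function pop(x)    { delete x[x[0]--] }   # remove the top element, then decrement
    BEGIN {
        delete x
        for ( i=1; i<=3; i++ ) { push(x,i) }
        print "after push: length=" length(x) ", tos=" x[0]
        for ( i=1; i<=3; i++ ) { pop(x) }
        print "after pop:  length=" length(x) ", tos=" x[0]
    }'
}
stack_demo
```

After the pops the stack is empty, but `length(x)` should still be 1 because the counter element is never deleted. The `tst_length.awk` variant stores no counter, so its array ends up with length 0.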

and the `time` output for each:

-----------
$ time awk -v n=10000000 -f tst_length.awk -f tst_stack.awk
0

real    0m4.816s
user    0m4.718s
sys     0m0.015s

$ time awk -v n=10000000 -f tst_numeric.awk -f tst_stack.awk
1

real    0m4.824s
user    0m4.718s
sys     0m0.015s

$ time awk -v n=10000000 -f tst_string.awk -f tst_stack.awk
1

real    0m14.748s
user    0m12.046s
sys     0m2.656s
-----------
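
For reference, the "roughly 3X" figure can be checked directly from the `real` and `user` times above. This is just arithmetic on the posted numbers; the `timing_ratios` wrapper name is hypothetical:

```shell
# Hypothetical helper: ratio of the string-index run to the numeric-index run,
# using the times reported above (seconds).
timing_ratios() {
    awk 'BEGIN {
        printf "real: %.2fx\n", 14.748 / 4.824   # tst_string vs tst_numeric, wall clock
        printf "user: %.2fx\n", 12.046 / 4.718   # tst_string vs tst_numeric, user CPU
    }'
}
timing_ratios
```

That works out to about 3.06x on wall-clock time and about 2.55x on user CPU time. The rest of the gap comes from `sys` time, which rises from 0.015s to 2.656s in the string-index run.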

Regards,

    Ed.

On 12/11/2023 11:55 PM, J Naman wrote:
Recently, Ed wrote, "Using a string `"tos"` instead of a number `0` for the
top of a stack index makes the script run about 3 times slower."
When I benchmarked it, I consistently got a 5% difference, which is slower,
but not 3x.
Process: I split() a string of numbers into an array (thus indexed
1->n), copied that array, and set array1[0] = an integer and array2["tos"] =
the same integer.
Looping 80 million times, the difference was 5%. Different loop counts were
5% too.
Presumably the 3x difference was true in some older gawk versions, but it
does not seem to be now. Or I got it wrong ... -john Naman

