[gawk-diffs] [SCM] gawk branch, long-double, updated. 3a4e6e9c74d0ba859b

gawk-diffs
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[gawk-diffs] [SCM] gawk branch, long-double, updated. 3a4e6e9c74d0ba859b

From:	John Haque
Subject:	[gawk-diffs] [SCM] gawk branch, long-double, updated. 3a4e6e9c74d0ba859b56ba066850dbba3b15eb22
Date:	Sat, 09 Feb 2013 00:55:49 +0000
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "gawk".

The branch, long-double has been updated
       via  3a4e6e9c74d0ba859b56ba066850dbba3b15eb22 (commit)
      from  3e6f4db8c8724164a52b12216746c6bb3e050b69 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://git.sv.gnu.org/cgit/gawk.git/commit/?id=3a4e6e9c74d0ba859b56ba066850dbba3b15eb22

commit 3a4e6e9c74d0ba859b56ba066850dbba3b15eb22
Author: John Haque <address@hidden>
Date:   Fri Feb 8 18:54:16 2013 -0600

    Added fmodl replacement, updated sinl and cosl to handle huge arguments.

diff --git a/TODO.LDBL b/TODO.LDBL
index f0bf2c9..61f073a 100644
--- a/TODO.LDBL
+++ b/TODO.LDBL
@@ -1,17 +1,27 @@
-* Don't trust compiler with type promotion:
-       **** If an operation involves two operands, and one of them is of type 
long double,
-       **** the other one is converted to long double. 
-The latest version of GCC I have (4.4.3) seems to have different ideas. Start 
using cast
-to long double even when it seems unnecessary address@hidden@#
-
-Also, append suffix L to all long double constants.
-
-* Convert gawk math routines to C for platforms lacking any. Without ceil() 
and floor()
-for long doubles, these can't even get 64 bit integers!
-       -- finish fp_fmod() in misc/fp_math.awk.
-
-* Don't use adjust_uint(uintmax_t n) from floatcomp.c, it is for AWKNUM. What 
is the point
-of floor() and ceil() wrappers? Don't have a clue.
-       -- DONE. Not sure if it is necessary for any long double and uintmax_t
-       combination found in the wild. Looks like ceil() and floor() wrappers
-       are for VMS; one probably should update comments in floatcomp.c.
+* Replacement math routines. - DONE.
+   Update awk version fp_math.awk for the changes in C code ?
+   FUTURE: Improvemnet/Optimization.
+     [1] Use truncated Taylor series and/or approximating polynomials.
+         It would require at least two versions (80-bit and 128-bit).
+     [2] Horner's rule to evaluate truncated series.
+     [3] Payne and Hanek Reduction Algorithm to provide more sig digs
+         for sin and cos with large arguments. It most likely involves more
+         than just extending the PI table to few thousand entries.
+         Pay particular attention to the case when x is almost an integer
+         multiple of PI/2. For a clean-room implementation, consider the
+         feasibility of using (64-bit) integer arithmetic.
+         References:
+         * Elementary Functions: Algorithms and Implementation. Jean-Michel 
Muller,
+           BirkhÃ¤user Boston.
+        * Handbook of Floating-Point Arithmetic. Jean-Michel Muller et. el.,
+           BirkhÃ¤user Boston.
+         * Argument Reduction for Huge Arguments: Good to the Last Bit. K. C. 
Ng,
+           SunPro, Sun Microsystems, Inc. Business.
+           Original fdmlib implementation copied by most libm.
+       
+*  Don't use adjust_uint(uintmax_t n) from floatcomp.c, it is for AWKNUM.
+   What is the point of floor() and ceil() wrappers? Don't have a clue.
+   - DONE. Not sure if it is necessary for any long double and uintmax_t
+          combination found in the wild. Looks like ceil() and floor()
+           wrappers are for VMS; One probably should update comments
+           in floatcomp.c.
diff --git a/long_double.c b/long_double.c
index 190efba..d604893 100644
--- a/long_double.c
+++ b/long_double.c
@@ -44,22 +44,24 @@
  * XXX: LDBL_MANT_DIG is defined if we are here. Assume FLT_RADIX = 2 or 16.
  */
 #if FLT_RADIX == 2
-#define LDBL_FRAC_BITS LDBL_MANT_DIG
+#define GAWK_LDBL_FRAC_BITS    LDBL_MANT_DIG
 #else
-#define LDBL_FRAC_BITS (4 * LDBL_MANT_DIG)
+#define GAWK_LDBL_FRAC_BITS    (4 * LDBL_MANT_DIG)
 #endif
 
+#define GAWK_LDBL_MAX_EXP      LDBL_MAX_EXP
+
 /*
  * N.B: If printf does not have "%Lf" format and the long double type is 
capable of
  * supporting integers wider than 64 bits, must have 64-bit long type.
  *
- * We actually use a maximum of 113 bits when LDBL_INT_BITS is 128.
+ * We actually use a maximum of 113 bits when GAWK_LDBL_INT_BITS is 128.
  */
 
-#if SIZEOF_GAWK_INT == 8 && LDBL_FRAC_BITS > 64
-#define        LDBL_INT_BITS   128
+#if SIZEOF_GAWK_INT == 8 && GAWK_LDBL_FRAC_BITS > 64
+#define        GAWK_LDBL_INT_BITS      128
 #else
-#define        LDBL_INT_BITS   64
+#define        GAWK_LDBL_INT_BITS      64
 #endif
 
 #define get_long_double(d)     getblock(d, BLOCK_LDBL, AWKLDBL *)
diff --git a/long_double.h b/long_double.h
index 2b8e22d..a509c15 100644
--- a/long_double.h
+++ b/long_double.h
@@ -397,7 +397,7 @@ static void
 awkldbl_init_vars()
 {
        unref(PREC_node->var_value);
-        PREC_node->var_value = make_awknum(LDBL_FRAC_BITS);
+        PREC_node->var_value = make_awknum(GAWK_LDBL_FRAC_BITS);
        PREC_node->var_value->flags |= NUMINT;
         unref(ROUNDMODE_node->var_value);
         ROUNDMODE_node->var_value = make_string("N", 1);
@@ -468,8 +468,8 @@ make_integer(uintmax_t n)
 
        /* XXX: is this really needed in this case ??? */
        
-       if (LDBL_FRAC_BITS < CHAR_BIT * sizeof (n)) {
-               int i = CHAR_BIT * sizeof (n) - LDBL_FRAC_BITS;
+       if (GAWK_LDBL_FRAC_BITS < CHAR_BIT * sizeof (n)) {
+               int i = CHAR_BIT * sizeof (n) - GAWK_LDBL_FRAC_BITS;
 
                /* strip leading `i' bits */
 
@@ -802,7 +802,7 @@ do_rand(int nargs ATTRIBUTE_UNUSED)
         *
         *      0 <= n < 1
         */
-       return make_awkldbl((random() % GAWK_RANDOM_MAX) / GAWK_RANDOM_MAX);
+       return make_awkldbl((AWKLDBL) (random() % GAWK_RANDOM_MAX) / 
GAWK_RANDOM_MAX);
 }
 
 /* do_srand --- seed the random number generator */
@@ -1663,7 +1663,7 @@ gawk_floorl_finite_p(AWKLDBL x, gawk_uint_t *chunk)
        int high, low, mid;
        AWKLDBL intval = LDC(0.0);
 
-#if    LDBL_INT_BITS == 128
+#if    GAWK_LDBL_INT_BITS == 128
        if (x >= pow2ld(113))
 #else
        if (x >= pow2ld(64))
@@ -1674,7 +1674,7 @@ gawk_floorl_finite_p(AWKLDBL x, gawk_uint_t *chunk)
                memset(chunk, '\0', 4 * sizeof (gawk_uint_t));
 
        /* binary search */
-       high = LDBL_INT_BITS - 1;
+       high = GAWK_LDBL_INT_BITS - 1;
        while (x >= LDC(2.0)) {
                low = 0;
                while (low <= high) {
@@ -1696,7 +1696,7 @@ gawk_floorl_finite_p(AWKLDBL x, gawk_uint_t *chunk)
                                 *      |<------- x (64/128 bits) ----->|
                                 */
 
-#if    LDBL_INT_BITS == 128
+#if    GAWK_LDBL_INT_BITS == 128
                                if (low <= 32) chunk[0] += (gawk_uint_t) 
pow2ld(low - 1);
                                else if (low <= 64) chunk[1] += (gawk_uint_t) 
pow2ld(low - 33);
                                else if (low <= 96) chunk[2] += (gawk_uint_t) 
pow2ld(low - 65);
@@ -1741,10 +1741,10 @@ format_uint_finite_p(char *str, size_t size, AWKLDBL x)
         *  URL: homepage.cs.uiowa.edu/~jones/bcd/decimal.html
         */
 
-#if    LDBL_INT_BITS == 128
+#if    GAWK_LDBL_INT_BITS == 128
        static gawk_uint_t coeff[] = { 
                1, 4967296, 9551616,
-#if defined(TEST_NUMBR) && TEST_NUMBR == 1
+#ifdef GAWK_INT_IS_LONG_LONG
                3585223950336ULL,
 #else
                3585223950336UL,
diff --git a/misc/ChangeLog b/misc/ChangeLog
index d668726..0e38871 100644
--- a/misc/ChangeLog
+++ b/misc/ChangeLog
@@ -1,3 +1,10 @@
+013-02-08         John Haque            <address@hidden>
+
+       * gawk_math.c (gawk_fmodl): Add new routine.
+       (gawk_sinl, gawk_cosl): Update to handle huge arguments.
+       * rem_pio2.c: New file.
+       * ldbl-tests/atan2.awk, ldbl-tests/fmod.awk: New tests for the math 
functions.
+
 2013-02-04         John Haque            <address@hidden>
 
        * gawk_math.c (gawk_sinl, gawk_cosl, gawk_atan2l):
diff --git a/misc/Makefile b/misc/Makefile
index 2a07457..e2a1bb1 100644
--- a/misc/Makefile
+++ b/misc/Makefile
@@ -3,10 +3,13 @@ LIBSTATIC = libmisc.a
 CC = gcc
 LD = gcc
 AR = ar
-AWKPROG = ../gawk
 CMP = cmp
 
-AWK = LC_ALL=$${GAWKLOCALE:-C} LANG=$${GAWKLOCALE:-C} $(AWKPROG)
+srcdir = .
+top_srcdir = ..
+
+AWKPROG = $(top_srcdir)/gawk
+AWK = LC_ALL=C LANG=C $(AWKPROG)
 
 ALL_CFLAGS = $(CFLAGS) -fPIC -DGAWK -DHAVE_CONFIG_H -c -I.. -I.
 
@@ -18,9 +21,9 @@ $(LIBSTATIC): $(OBJS)
        $(AR) rc $(LIBSTATIC) $(OBJS)
        ranlib $(LIBSTATIC)
 
-float128.o: float128.c ../awk.h ../long_double.h ./gawk_math.h ./gawk_math.c
+float128.o: float128.c $(top_srcdir)/awk.h $(top_srcdir)/long_double.h 
$(srcdir)/gawk_math.h $(srcdir)/gawk_math.c $(srcdir)/rem_pio2.c
 
-float80.o: float80.c ../awk.h ../long_double.h ./gawk_math.h ./gawk_math.c
+float80.o: float80.c $(top_srcdir)/awk.h $(top_srcdir)/long_double.h 
$(srcdir)/gawk_math.h $(srcdir)/gawk_math.c $(srcdir)/rem_pio2.c
 
 .c.o:
        $(CC) $(ALL_CFLAGS) $< -o $@
@@ -46,10 +49,13 @@ LDBL_TESTS = \
        sin64 \
        sin113 \
        cos64 \
-       cos113
+       cos113 \
+       atan64 \
+       atan113 \
+       fmod64 \
+       fmod113
 
 INFILE = ldbl-tests/data.in
-INFILE32 = ldbl-tests/sincos.in
 
 # An attempt to print something that can be grepped for in build logs
 pass-fail:
@@ -66,10 +72,6 @@ ldbl-tests: $(LDBL_TESTS)
 $(INFILE):
        @$(AWK) -M -vPREC=quad -f ldblin.awk > $(INFILE) 2>&1
 
-$(INFILE32): $(INFILE)
-       @$(AWK) -M -vPREC=quad -f ldblin.awk | \
-$(AWK) -M -vPREC=quad '($$1 > -2^32 && $$1 < 2^32){ print $$0 }' > $(INFILE32) 
2>&1
-
 ldblint64:
        @echo $@
        @$(AWK) -B0 -f ldbl-tests/address@hidden > ldbl-tests/_$@ 2>&1
@@ -128,26 +130,51 @@ pow113: $(INFILE)
        @$(AWK) -M -vDIG=32 -vPREC=113 -f ldbl-tests/pow.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
        @-$(AWK) -M -vTOL=20 -f floatcmp.awk ldbl-tests/_$@ 
ldbl-tests/address@hidden && rm ldbl-tests/_$@
 
-sin64:
+sin64: $(INFILE)
+       @echo $@
+       @$(AWK) -B0 -vDIG=17 -f ldbl-tests/sin.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=17 -vPREC=64 -f ldbl-tests/sin.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
+       @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
+
+sin113: $(INFILE)
+       @echo $@
+       @$(AWK) -B1 -vDIG=32 -f ldbl-tests/sin.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=32 -vPREC=113 -f ldbl-tests/sin.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
+       @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
+
+cos64: $(INFILE)
+       @echo $@
+       @$(AWK) -B0 -vDIG=17 -f ldbl-tests/cos.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=17 -vPREC=64 -f ldbl-tests/cos.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
+       @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
+
+cos113: $(INFILE)
+       @echo $@
+       @$(AWK) -B1 -vDIG=32 -f ldbl-tests/cos.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=32 -vPREC=113 -f ldbl-tests/cos.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
+       @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
+
+atan64: $(INFILE)
        @echo $@
-       @$(AWK) -B0 -vDIG=17 -f ldbl-tests/sin.awk $(INFILE32) > ldbl-tests/_$@ 
2>&1
-       @$(AWK) -M -vDIG=17 -vPREC=64 -f ldbl-tests/sin.awk $(INFILE32) > 
ldbl-tests/address@hidden 2>&1
+       @$(AWK) -B0 -vDIG=17 -f ldbl-tests/atan2.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=17 -vPREC=64 -f ldbl-tests/atan2.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
        @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
 
-sin113: $(INFILE32)
+atan113: $(INFILE)
        @echo $@
-       @$(AWK) -B1 -vDIG=32 -f ldbl-tests/sin.awk $(INFILE32) > ldbl-tests/_$@ 
2>&1
-       @$(AWK) -M -vDIG=32 -vPREC=113 -f ldbl-tests/sin.awk $(INFILE32) > 
ldbl-tests/address@hidden 2>&1
+       @$(AWK) -B1 -vDIG=32 -f ldbl-tests/atan2.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=32 -vPREC=113 -f ldbl-tests/atan2.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
        @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
 
-cos64:
+fmod64: $(INFILE)
        @echo $@
-       @$(AWK) -B0 -vDIG=17 -f ldbl-tests/cos.awk $(INFILE32) > ldbl-tests/_$@ 
2>&1
-       @$(AWK) -M -vDIG=17 -vPREC=64 -f ldbl-tests/cos.awk $(INFILE32) > 
ldbl-tests/address@hidden 2>&1
+       @$(AWK) -B0 -vDIG=17 -f ldbl-tests/fmod.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=17 -vPREC=64 -f ldbl-tests/fmod.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
        @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
 
-cos113: $(INFILE32)
+fmod113: $(INFILE)
        @echo $@
-       @$(AWK) -B1 -vDIG=32 -f ldbl-tests/cos.awk $(INFILE32) > ldbl-tests/_$@ 
2>&1
-       @$(AWK) -M -vDIG=32 -vPREC=113 -f ldbl-tests/cos.awk $(INFILE32) > 
ldbl-tests/address@hidden 2>&1
+       @echo 'This may take a while.'
+       @$(AWK) -B1 -vDIG=32 -f ldbl-tests/fmod.awk $(INFILE) > ldbl-tests/_$@ 
2>&1
+       @$(AWK) -M -vDIG=32 -vPREC=113 -f ldbl-tests/fmod.awk $(INFILE) > 
ldbl-tests/address@hidden 2>&1
        @-$(AWK) -M -f floatcmp.awk ldbl-tests/_$@ ldbl-tests/address@hidden && 
rm ldbl-tests/_$@
diff --git a/misc/float128.c b/misc/float128.c
index ac66520..d5fefab 100644
--- a/misc/float128.c
+++ b/misc/float128.c
@@ -51,53 +51,55 @@
  *     https://github.com/mirrors/gcc/blob/master/libquadmath/quadmath.h
  */
 
-#define FLT128_MAX 1.18973149535723176508575932662800702e4932Q
-#define FLT128_MIN 3.36210314311209350626267781732175260e-4932Q
-#define FLT128_EPSILON 1.92592994438723585305597794258492732e-34Q
-#define FLT128_DENORM_MIN 6.475175119438025110924438958227646552e-4966Q
-#define FLT128_MANT_DIG 113
-#define FLT128_MIN_EXP (-16381)
-#define FLT128_MAX_EXP 16384
-#define FLT128_DIG 33
-#define FLT128_MIN_10_EXP (-4931)
-#define FLT128_MAX_10_EXP 4932
+#define        FLT128_MAX      1.18973149535723176508575932662800702e4932Q
+#define        FLT128_MIN      3.36210314311209350626267781732175260e-4932Q
+#define        FLT128_EPSILON  1.92592994438723585305597794258492732e-34Q
+#define        FLT128_DENORM_MIN       
6.475175119438025110924438958227646552e-4966Q
+#define        FLT128_MANT_DIG 113
+#define        FLT128_MIN_EXP  (-16381)
+#define        FLT128_MAX_EXP  16384
+#define        FLT128_DIG      33
+#define        FLT128_MIN_10_EXP       (-4931)
+#define        FLT128_MAX_10_EXP       4932
 
 /* #define HUGE_VALQ __builtin_huge_valq() */
 /* The following alternative is valid, but brings the warning:
    (floating constant exceeds range of â__float128â)  */
-#define HUGE_VALQ (__extension__ 0x1.0p32767Q)
+#define        HUGE_VALQ (__extension__ 0x1.0p32767Q)
 
-#define IEEE_FLOAT128_BIAS 0x3fff
+#define        IEEE_FLOAT128_BIAS 0x3fff
 
 /* end of Macros from quadmath.h */
 
 
-#define LDBL_FRAC_BITS FLT128_MANT_DIG
-#define        LDBL_INT_BITS   128
+#define        GAWK_LDBL_FRAC_BITS     FLT128_MANT_DIG
+#define        GAWK_LDBL_MAX_EXP       FLT128_MAX_EXP
+#define        GAWK_LDBL_INT_BITS      128
 
 #define AWKLDBL        __float128
-#define LDBL_VAL(n)    (*((AWKLDBL *) (n)->qnumbr))
-#define LDC(x)         x##Q
+#define        LDBL_VAL(n)     (*((AWKLDBL *) (n)->qnumbr))
+#define        LDC(x)          x##Q
 
-#define get_long_double(d)     emalloc(d, void *, sizeof(AWKLDBL), "float128")
-#define free_long_double(d)    efree(d)
+#define        get_long_double(d)      emalloc(d, void *, sizeof(AWKLDBL), 
"float128")
+#define        free_long_double(d)     efree(d)
 
 /* we want to format integers ourself */
-#define GAWK_FMT_INT 1
-
-#define gawk_int_t long long
-#define gawk_uint_t unsigned long long
-#ifdef SIZEOF_GAWK_INT
-#undef SIZEOF_GAWK_INT
-#undef GAWK_INT_MAX
-#undef GAWK_INT_MIN
-#undef GAWK_UINT_MAX
+#define        GAWK_FMT_INT 1
+
+#define        gawk_int_t long long
+#define        gawk_uint_t unsigned long long
+#ifdef SIZEOF_GAWK_INT
+#undef SIZEOF_GAWK_INT
+#undef GAWK_INT_MAX
+#undef GAWK_INT_MIN
+#undef GAWK_UINT_MAX
 #endif
 
 #define SIZEOF_GAWK_INT        8
 #define        GAWK_INT_MAX    LLONG_MAX
 #define        GAWK_INT_MIN    LLONG_MIN
 #define        GAWK_UINT_MAX   ULLONG_MAX
+#define GAWK_INT_IS_LONG_LONG  1
 
 static int format_uint_finite_p(char *str, size_t size, AWKLDBL x);
 static AWKLDBL gawk_floorl_finite_p(AWKLDBL x, gawk_uint_t *chunk);
@@ -148,8 +150,6 @@ numbr_handler_t float128_hndlr;
 
 #define awkldbl_hndlr float128_hndlr
 
-#define TEST_NUMBR 1
-
 #include "misc/gawk_math.h"
 #include "long_double.h"
 #include "misc/gawk_math.c"
@@ -344,10 +344,8 @@ float128_to_hex(__float128 x)
        return buf;
 }
 
-/*
- * format_float_1 --- format a single AWKLDBL value according to FORMAT.
- *     The value must be finite.
- */
+
+/* format_float_1 --- format a single AWKLDBL value according to FORMAT. */
 
 static int
 format_float_1(char *str, size_t size, const char *format, int fw, int prec, 
AWKLDBL x)
diff --git a/misc/float80.c b/misc/float80.c
index 8d2a808..263b6a6 100644
--- a/misc/float80.c
+++ b/misc/float80.c
@@ -40,8 +40,9 @@
 #define FLT_RADIX 2
 #endif
 
-#define LDBL_FRAC_BITS LDBL_MANT_DIG
-#define        LDBL_INT_BITS   64
+#define GAWK_LDBL_FRAC_BITS    LDBL_MANT_DIG
+#define GAWK_LDBL_MAX_EXP      LDBL_MAX_EXP
+#define        GAWK_LDBL_INT_BITS      64
 
 #define get_long_double(d)     emalloc(d, void *, sizeof(AWKLDBL), "float80")
 #define free_long_double(d)    efree(d)
diff --git a/misc/fp_math.awk b/misc/fp_math.awk
index 60048b1..1c3bfa5 100644
--- a/misc/fp_math.awk
+++ b/misc/fp_math.awk
@@ -1,8 +1,7 @@
 # fp_math.awk --- finite precision math functions
 
 #
-#      TODO:   * finish fmod().
-#              * replace all usages of %, and int() or any other math builtin; 
and() is ok!.
+#      TODO:   * replace all usages of %, and int() or any other math builtin; 
and() is ok!.
 #                implement double_to_int() and remove floor/ceil.
 #
 
@@ -409,7 +408,7 @@ function frexpl(x, e, \
 
        low = 0
        if (x > 2) {
-               high = LDBL_MAX_EXP - 1         # XXX: should be 4 * 
LDBL_MAX_EXP - 1 if FLT_RADIX = 16
+               high = LDBL_MAX_EXP - 1
                while (low <= high) {
                        mid = int ((low + high) / 2)
                        y = x / pow2(mid)
@@ -421,7 +420,7 @@ function frexpl(x, e, \
                x /= pow2(low)
                e[0] = low
        } else if (x < 1) {
-               high = LDBL_MAX_EXP - 1         # could be -LDBL_MIN_EXP, but 
no harm in using LDBL_MAX_EXP
+               high = LDBL_MAX_EXP - 1
                while (low <= high) {
                        mid = int ((low + high) / 2)
                        y =  x * pow2(mid)
@@ -443,18 +442,6 @@ function frexpl(x, e, \
 }
 
 
-#
-# $ ./gawk 'BEGIN { printf("%.18e\n", log(7.12111e900))}'
-# inf
-# $ ./gawk -B 'BEGIN { printf("%.18e\n", log(7.12111e900))}'           # with 
logl(), without it inf
-# 2.074289647306790437e+03
-# $ ./gawk -B -f misc/fp_math.awk -e 'BEGIN { printf("%.18e\n", 
fp_log(7.12111e900))}'
-# 2.074289647306790437e+03
-# $ ./gawk -M -vPREC=quad 'BEGIN { printf("%.18e\n", log(7.12111e900))}'
-# 2.074289647306790438e+03
-#
-
-
 function fp_log(x,     \
                e, exponent, i, ypow2, ypow_odd, sum, err, term, sign)
 {
@@ -713,60 +700,8 @@ function fp_ceil(x,        d)
        return d + 1
 }
 
-#
-#      Calculating fmod with correctly rounded output isn't easy:
-#
-# BEGIN {
-#      POW2[0] = 1
-#      for (i = 1; i <= 64; i++) {
-#              POW2[i] = POW2[i - 1] * 2
-#      }
-#      x = 343.4341
-#      y = 1.324355
-#      printf("builtin: %0.18f\n", x % y)
-#
-#      q = int(x / y)  # floor
-#      printf("simple : %0.18f\n", x - q * y)
-#
-#      r = calc_fmod(x, y)
-#      printf("pow2   : %0.18f\n", r)
-# }
-#
-# function calc_fmod(x, y,     q, j)
-# {
-#      # x > 0, y > 0
-#      q = int(x / y)  # floor, q < 2^64
-#      while (q > 2) {
-#              # XXX: use binary search?
-#              for (j = 1; j <= 64; j++) {
-#                      if (q <= POW2[j]) {
-#                              x -= POW2[j - 1] * y
-#                              q -= POW2[j - 1]
-#                              break
-#                      }
-#              }
-#      }
-#      x -= q * y
-#      return x
-# }
-#
-# We can only get correctly rounded 16 digits (with 80 bit long double),
-# using math library or not:
-#
-# $ ./gawk -B -f fmod.awk 
-# builtin: 0.426155000000000019
-# simple : 0.426155000000000006
-# pow2   : 0.426155000000000019
-#
-# For comparison, MPFR quad precision output:
-#
-# $ ./gawk -M -vPREC=quad 'BEGIN { printf("%0.18f\n",  343.4341 % 1.324355) }'
-# 0.426155000000000000
-#
-
-
 function fp_fmod(x, y, \
-               z, sign)
+               zx, ex, zy, ey, exy, signx, txy, ty)
 {
        x = +x
        y = +y
@@ -778,24 +713,62 @@ function fp_fmod(x, y, \
        if (isinf(y))
                return x
 
-       sign = 1
+       signx = 1
        if (x < 0) {
-               sign = -1
+               signx = -1
                x = -x
        }
+
        if (y < 0)
                y = -y
 
-       if (x < y) # nothing to do
-               return sign * x
+       # x > 0, y > 0
+
+       zy = frexpl(y, e)
+       ey = e[0]
+
+       zx = frexpl(x, e)
+       ex = e[0]
+
+       exy = ex - ey
+       while (exy > 1) {
+               while (zx < z1 && exy > 0) {
+                       zx *= 2;
+                       ex--;
+               }
+               if (exy < 0)
+                       break
+               zx -=  zy
+
+               if (exy >= 1024) {
+                       # avoid overflow in 2^n computation
+                       txy = exy
+                       ty = y  
+                       while (txy >= 1024) {
+                               ty *= pow2ld(1024)
+                               txy -= 1024
+                       }
+                       x -= pow2ld(txy) * ty
+               } else  
+                       x -= pow2ld(exy) * y
+       }
 
-       q = fp_floor(x / y)
+       while (x > y)
+               x -= y
 
-       # FIXME -- see above, and also consider integer overflow (q >= 
2^LDBL_MANT_DIG)
-       return sign * (x - q * y)
+       if (signx > 0) {
+               if (x < 0)
+                       x += y
+       } else {
+               x = -x
+               if (x > 0)
+                       x -= y
+       }
+       return x
 }
 
 
+
 # fixprec -- fixes "0.18e" output
 
 function fixprec(str, numdigs, \
diff --git a/misc/gawk_math.c b/misc/gawk_math.c
index 27ca215..bafe833 100644
--- a/misc/gawk_math.c
+++ b/misc/gawk_math.c
@@ -51,8 +51,10 @@
 static AWKLDBL taylor_exp(AWKLDBL x);
 static AWKLDBL taylor_cos(AWKLDBL x);
 static AWKLDBL taylor_sin(AWKLDBL x);
-static AWKLDBL euler_atan_1(AWKLDBL x);
+static AWKLDBL arctan__p(AWKLDBL y, AWKLDBL x);
 static AWKLDBL gawk_frexpl(AWKLDBL x, int *exponent);
+static int gawk_rem_pio2l(AWKLDBL x, AWKLDBL *y);
+
 
 /* gawk_sinl --- Compute sin(x) */
 
@@ -110,12 +112,35 @@ gawk_sinl(AWKLDBL x)
                        sinval = taylor_sin(-x);
                        break;
                default:
-                       break;
+                       cant_happen();
                }
        } else {
-               /* FIXME -- Payne and Hanek Reduction Algorithm ? */
+               AWKLDBL y[2];
+               int n;
+
+               /* Payne and Hanek Reduction Algorithm */
+
+               n = gawk_rem_pio2l(x, y);
+               assert(n >= 0);
+
+               /* We have no use for the tail part y[1]; Expect about 15-16 
sig digs of precision. */
 
-               sinval = (AWKLDBL) sin( (double) x);
+               switch (n & 3) {
+               case 0:
+                       sinval = taylor_sin(y[0]);
+                       break;
+               case 1:
+                       sinval = taylor_cos(y[0]);
+                       break;
+               case 2:
+                       sinval = -taylor_sin(y[0]);
+                       break;
+               case 3:
+                       sinval = -taylor_cos(y[0]);
+                       break;
+               default:
+                       cant_happen();
+               }
        }
 
        return sign > 0 ? sinval : -sinval;
@@ -174,12 +199,34 @@ gawk_cosl(AWKLDBL x)
                        cosval = -taylor_cos(-x);
                        break;
                default:
-                       break;
+                       cant_happen();
                }
        } else {
-               /* FIXME Payne and Hanek Reduction Algorithm ? */
+               /* Payne and Hanek Reduction Algorithm */
+               AWKLDBL y[2];
+               int n;
+
+               n = gawk_rem_pio2l(x, y);
+               assert(n >= 0);
 
-               cosval = (AWKLDBL) cos( (double) x);
+               /* We have no use for the tail part y[1]; Expect about 15-16 
sig digs of precision. */
+
+               switch (n & 3) {
+               case 0:
+                       cosval = taylor_cos(y[0]);
+                       break;
+               case 1:
+                       cosval = -taylor_sin(y[0]);
+                       break;
+               case 2:
+                       cosval = -taylor_cos(y[0]);
+                       break;
+               case 3:
+                       cosval = taylor_sin(y[0]);
+                       break;
+               default:
+                       cant_happen();
+               }
        }
 
        return sign > 0 ? cosval : -cosval;
@@ -229,8 +276,10 @@ gawk_atan2l(AWKLDBL y, AWKLDBL x)
                        y = -y;
                }
                if (y > x)
-                       return sign * (GAWK_PI_2 - euler_atan_1(y / x));
-               return sign * euler_atan_1(x / y);
+                       return sign * (GAWK_PI_2 - arctan__p(x, y));
+
+               /* for y/x <= 1.0e-20, use atan2(y, x) ~ y/x */
+               return y <= LDC(1.0e-20) * x ? sign * y / x : sign * 
arctan__p(y, x);
        }
 
        if (x < _0L) {
@@ -241,13 +290,19 @@ gawk_atan2l(AWKLDBL y, AWKLDBL x)
                if (y < _0L) {
                        y = -y;
                        if (y > x)
-                               return -GAWK_PI_2 - euler_atan_1(y / x);
-                       return euler_atan_1(x / y) - GAWK_PI;
+                               return -GAWK_PI_2 - arctan__p(x, y);
+                       /*
+                        * XXX: for small y/x, single term approximation ?
+                        *      (y / x - GAWK_PI_LOW) - GAWK_PI_HIGH
+                        */ 
+                       return arctan__p(y, x) - GAWK_PI;
                }
                /* y > 0 */
                if (y > x)
-                       return GAWK_PI_2 + euler_atan_1(y / x);
-               return - euler_atan_1(x / y) + GAWK_PI;
+                       return GAWK_PI_2 + arctan__p(x, y);
+
+               /* XXX: for small y/x, single term approximation ? */ 
+               return -arctan__p(y, x) + GAWK_PI;
        }
 
        /*
@@ -300,7 +355,6 @@ gawk_logl(AWKLDBL x)
        frac = gawk_frexpl(x, & iexp);  /* frac in [1, 2) */
        exponent = (AWKLDBL) iexp;
 
-
        /*
         * arctanh(x) series has faster convergence when x is close to 1.
         * Perform a range reduction so that 1 / sqrt(2) <= x <= sqrt(2).
@@ -357,7 +411,7 @@ gawk_expl(AWKLDBL x)
                return _0L;
        if (x == _0L)
                return _1L;
-       if (x >= (AWKLDBL) LDBL_MAX_EXP * GAWK_LOG2)    /* overflow */
+       if (x >= (AWKLDBL) GAWK_LDBL_MAX_EXP * GAWK_LOG2)       /* overflow */
                return GAWK_INFINITY;
        if (x <= (AWKLDBL) (LDBL_MIN_EXP - LDBL_MANT_DIG - 1) * GAWK_LOG2)      
/* underflow */
                return _0L;
@@ -379,7 +433,7 @@ gawk_expl(AWKLDBL x)
 
                AWKLDBL y;
 
-               /* High precision calculation using limited precision float */
+               /* Extra precision calculation using limited precision float */
                /*
                 * We need to calculate x - k * log2 with extra precision. If k 
is not a power
                 * of 2, it would require more than LDBL_MANT_DIG bits for the 
product
@@ -400,33 +454,27 @@ gawk_expl(AWKLDBL x)
        return sign < 0 ? (_1L / expval) : expval;
 }
 
-static AWKLDBL
-gawk_fmodl(AWKLDBL x, AWKLDBL y)
-{
-       return fmod( (double) x, (double) y);
-}
-
 
-#define        GAWK_LDBL_INTEGER       1
-#define        GAWK_LDBL_EVEN_INTEGER  2
-#define        GAWK_LDBL_ODD_INTEGER   4
+#define        GAWK_LDBL_INTEGER       0x01
+#define        GAWK_LDBL_EVEN_INTEGER  0x02
+#define        GAWK_LDBL_ODD_INTEGER   0x04
 
-/* gawk_is_integer__p --- is x an (even or odd) integer ? */
+/* gawk_integer__p --- is x an (even or odd) integer ? */
 
 static unsigned int
-gawk_is_integer__p(AWKLDBL x)
+gawk_integer__p(AWKLDBL x)
 {
        AWKLDBL ival;
        unsigned ret = 0;
 
        if (isnan(x) || isinf(x))
-               return (unsigned int) false;
+               return 0;
        if (x < _0L)
                x = -x;
        if (x < _1L)
-               return (unsigned int) false;
+               return 0;
        if ((ival = double_to_int(x)) != x)
-               return (unsigned int) false;
+               return 0;
        ret = GAWK_LDBL_INTEGER;
        if (ival >= pow2ld(LDBL_MANT_DIG))
                ret |= GAWK_LDBL_EVEN_INTEGER;
@@ -440,8 +488,8 @@ gawk_is_integer__p(AWKLDBL x)
        return ret;
 }
 
-#define gawk_is_integer(x)     ((gawk_is_integer__p(x) & GAWK_LDBL_INTEGER) != 
0)
-#define gawk_is_odd_integer(x) ((gawk_is_integer__p(x) & 
GAWK_LDBL_ODD_INTEGER) != 0)
+#define gawk_is_integer(x)     ((gawk_integer__p(x) & GAWK_LDBL_INTEGER) != 0)
+#define gawk_is_odd_integer(x) ((gawk_integer__p(x) & GAWK_LDBL_ODD_INTEGER) 
!= 0)
 
 /* gawk_powl --- Compute x^y */
 
@@ -537,11 +585,12 @@ gawk_powl(AWKLDBL x, AWKLDBL y)
                }
                expval *= gawk_expl(frac * gawk_logl(x));
        } else
-               expval = gawk_expl(y * gawk_logl(x));   /* XXX: likely infinity 
or zero */ 
+               expval = gawk_expl(y * gawk_logl(x));   /* XXX: most likely 
infinity ? */ 
 
        return sign > 0 ? expval : (_1L / expval);
 }
 
+
 /* gawk_sqrtl --- Compute sqrt(x) using Newton's method */
 
 static AWKLDBL
@@ -587,10 +636,86 @@ gawk_sqrtl(AWKLDBL x)
        yn = (yn + x / yn) / _2L;
        yn = (yn + x / yn) / _2L;
        yn = (yn + x / yn) / _2L;
-
        return yn;
 }
 
+/*
+ * gawk_fmodl --- Compute the floating-point remainder of dividing x by y
+ *     Method: shift and subtract.
+ */
+
+static AWKLDBL
+gawk_fmodl(AWKLDBL x, AWKLDBL y)
+{
+       AWKLDBL zx, zy, q;
+       int ex, ey, exy;
+       int signx = 1;
+       unsigned low, high, mid;
+
+       if (isnan(x) || isnan(y) 
+               || isinf(x) || y == _0L         /* XXX: set errno = EDOM ? */
+       )
+               return GAWK_NAN;
+
+       if (x == _0L)
+               return x;       /* +0 or -0 */
+       if (isinf(y))
+               return x;
+       
+       if (x < _0L) {
+               signx = -1;
+               x = -x;
+       }
+       if (y < _0L)
+               y = -y;
+
+       /* x > 0, y > 0 */
+       zy = gawk_frexpl(y, & ey);
+       zx = gawk_frexpl(x, & ex);
+       exy = ex - ey;
+       while (exy > 1) {
+               if (zx == _0L)
+                       return signx * _0L;
+
+               while (zx < zy && exy > 0) {
+                       zx *= 2;
+                       exy--;
+               }
+               if (exy < 0)
+                       break;
+
+               zx -= zy;
+
+#define        GAWK_LDBL_MAX_10_EXP    ((GAWK_LDBL_MAX_EXP + 1) / 4)
+               if (exy >= GAWK_LDBL_MAX_10_EXP) {
+                       /* Avoid possible overflow in 2^n computation */
+ 
+                       AWKLDBL tmp_exy, tmp_y;
+                       tmp_exy = exy;
+                       tmp_y = y;
+                       while (tmp_exy >= GAWK_LDBL_MAX_10_EXP) {
+                               tmp_y *= pow2ld(GAWK_LDBL_MAX_10_EXP);
+                               tmp_exy -= GAWK_LDBL_MAX_10_EXP;
+                       }
+                       x -= pow2ld(tmp_exy) * tmp_y;
+               } else
+                       x -= pow2ld(exy) * y;
+       }
+#undef GAWK_LDBL_MAX_10_EXP
+
+       while (x >= y)
+               x -= y;
+       if (signx > 0) {
+               if (x < _0L)
+                       x += y;
+       } else {
+               x = -x;
+               if (x > _0L)
+                       x -= y;
+       }
+       return x;
+}
+
 
 /*
  * gawk_frexpl --- split the number x into a normalized fraction and an 
exponent.
@@ -610,7 +735,7 @@ gawk_frexpl(AWKLDBL x, int *exponent)
 
        low = 0;
        if (x > _2L) {
-               high = LDBL_MAX_EXP - 1;        /* XXX: should be 4 * 
LDBL_MAX_EXP - 1 if FLT_RADIX = 16 ? */
+               high = GAWK_LDBL_MAX_EXP - 1;
                while (low <= high) {
                        mid = (low + high) / 2;
                        y = x / pow2ld(mid);
@@ -622,7 +747,7 @@ gawk_frexpl(AWKLDBL x, int *exponent)
                x /= pow2ld(low);
                *exponent = low;
        } else if (x < _1L) {
-               high = LDBL_MAX_EXP - 1;        /* could be -LDBL_MIN_EXP, but 
no harm in using LDBL_MAX_EXP */
+               high = GAWK_LDBL_MAX_EXP - 1;   /* could be -LDBL_MIN_EXP, but 
no harm in using LDBL_MAX_EXP */
                while (low <= high) {
                        mid = (low + high) / 2;
                        y =  x * pow2ld(mid);
@@ -670,8 +795,7 @@ taylor_exp(AWKLDBL x)
 
        k = 1;
        while (x > LDC(0.001)) {
-               /* XXX: For x <= 0.001, max(k) = 10, and max # of terms 6 
(80-bit) / 10 (128-bit) */
-               
+               /* XXX: For x <= 0.001, max(k) = 10, and max # of terms 6 
(80-bit) / 10 (128-bit) */    
                x /= _2L; 
                k++;
        }
@@ -693,7 +817,10 @@ taylor_exp(AWKLDBL x)
        return y + _1L;
 }
 
-/* taylor_sin --- Compute sin(x) using Taylor series */
+/*
+ * taylor_sin --- Compute sin(x) using Taylor series
+ *     sin(x) = (x - x^3/3!) + (x^5/5! - x^7/7!) + ...
+ */
 
 static AWKLDBL
 taylor_sin(AWKLDBL x)
@@ -701,12 +828,15 @@ taylor_sin(AWKLDBL x)
        AWKLDBL xpow, xpow_odd, sinval;
        AWKLDBL err, term, fact;
        unsigned int i;
-
-       assert(x >= _0L);
+       int sign = 1;
 
        if (x == _0L)
                return x;
-
+       if (x < _0L) {
+               sign = -1;
+               x = -x;
+       }
+       
        i = 3;
        fact = LDC(6.0);        /* 3! */
        xpow = x * x;
@@ -727,10 +857,13 @@ taylor_sin(AWKLDBL x)
                sinval += term;
                err = term / sinval;
        } while (err > REL_ERROR);
-       return sinval;
+       return sign > 0 ? sinval : -sinval;
 }
 
-/* taylor_cos --- Compute cos(x) using Taylor series */
+/*
+ * taylor_cos --- Compute cos(x) using Taylor series
+ *     cos(x) = (1 - x^2/2!) + (x^4/4! - x^6/6!)...
+ */
 
 static AWKLDBL
 taylor_cos(AWKLDBL x)
@@ -741,6 +874,8 @@ taylor_cos(AWKLDBL x)
 
        if (x == _0L)
                return _1L;
+       if (x < _0L)
+               x = -x;
 
        i = 2;
        fact = _2L;     /* 2! */
@@ -765,12 +900,69 @@ taylor_cos(AWKLDBL x)
        return cosval;
 }
 
-/* euler_atan_1 --- Compute Euler arctan(1/x) approximation */ 
+
+/*
+ * taylor_atan --- Compute arctan(x) using Taylor series
+ *     arctan(x) = (x - x^3/3) + (x^5/5 - x^7/7) + ...
+ */
 
 static AWKLDBL
-euler_atan_1(AWKLDBL x)
+taylor_atan(AWKLDBL x)
 {
-       AWKLDBL xpow2_plus_one, term, sum, err;
+       AWKLDBL xpow, xpow_odd, atanval;
+       AWKLDBL err, term;
+       unsigned int i;
+       int inverse = 0, mult = 1;
+       
+       /* x >= 0.1 */
+
+       assert(x > _0L);
+       if (x > _1L) {
+               /* arctan(x) = pi / 2 - aractan(1 / x), x > 0 */
+               inverse = 1;
+               x = _1L / x;
+       }
+
+       /* x <= 1.0 */
+
+       /* range reduction --- arctan(x) = 2 arctan(x / (1 + sqrt(1 + x^2))) */
+       while (x > LDC(0.05)) {
+               mult *= 2;
+               x = x / (_1L + gawk_sqrtl(_1L + x * x));
+       } 
+
+       /* maximum ~14 terms for 128-bit and ~8 for 80-bit */
+       
+       i = 3;
+       xpow = x * x;
+       xpow_odd = xpow * x;
+       atanval = x - xpow_odd / ((AWKLDBL) i);
+
+       do {
+               i += 2;
+               xpow_odd *= xpow;
+               term = xpow_odd / ((AWKLDBL) i);
+
+               i += 2;
+               xpow_odd *= xpow;
+               term -= xpow_odd / ((AWKLDBL) i);
+
+               atanval += term;
+               err = term / atanval;
+       } while (err > REL_ERROR);
+
+       if (inverse)
+               return GAWK_PI_2 - atanval * ((AWKLDBL) mult);
+       return atanval * ((AWKLDBL) mult);
+}
+
+
+/* arctan__p --- Compute arctan(y / x) */ 
+
+static AWKLDBL
+arctan__p(AWKLDBL y, AWKLDBL x)
+{
+       AWKLDBL z, xpow2_plus_one, term, atanval, err;
        int sign = 1;
        unsigned int i;
 
@@ -781,28 +973,106 @@ euler_atan_1(AWKLDBL x)
         *      y = (x^2) / (1 + x^2) and -1 <= x <= 1
         *
         * Substituting x = 1/x, for x >= 1
-        *      atan(1/x) = (x / (1 + x^2))    + (2/3) * (x / (1 + x^2)^2)
-        *                                     + (2*4/(3*5)) * (x / (1 + x^2)^3)
-        *                                     + (2*4*6/(3*5*7)) * (x / (1 + 
x^2)^4) + ...
+        *      atan(1/x) = (x / (1 + x^2))  + (2/3) * (x / (1 + x^2)^2)
+        *                               + (2*4/(3*5)) * (x / (1 + x^2)^3)
+        *                               + (2*4*6/(3*5*7)) * (x / (1 + x^2)^4) 
+ ...
         */
 
-       if (x < _0L) {
+       z = x / y;
+       if (z < _0L) {
                sign = -1;
-               x = -x;
+               z = -z;
        }
+       assert(z > _1L);
 
-       xpow2_plus_one = x * x + _1L;
-       term = x / xpow2_plus_one;
-       sum = term;
+       if (z <= LDC(20.0)) {
+               /* For y / x  >= 0.05, Euler atan is slow! */
+               atanval = taylor_atan(_1L / z);
+               return sign > 0 ? atanval : -atanval;
+       }
+
+       /* maximum ~14 terms for 128-bit and ~8 for 80-bit */
+
+       xpow2_plus_one = z * z + _1L;
+       term = z / xpow2_plus_one;
+       atanval = term;
        i = 0;
 
        do {
                term *= (AWKLDBL) (i + 2);
                term /= ((AWKLDBL) (i + 3)) * xpow2_plus_one;
                i += 2;
-               sum += term;
-               err = term / sum;
+               atanval += term;
+               err = term / atanval;
        } while (err > REL_ERROR);
 
-       return sign > 0 ? sum : -sum;
+       return sign > 0 ? atanval : -atanval;
+}
+
+
+/* gawk_scalbn --- return X * 2^E */
+
+static inline double
+gawk_scalbn(double x, int e)
+{
+       if (e >= 0)
+               return x * ((double) pow2ld(e));
+       return x / ((double) pow2ld(-e));
+}
+
+#define scalbn gawk_scalbn
+#include "rem_pio2.c"
+
+
+/* gawk_rem_pio2l --- return the x remainder pi/2 in y[0], y[1] */
+
+static int
+gawk_rem_pio2l(AWKLDBL x, AWKLDBL *y)
+{
+       AWKLDBL t, w;
+       int e0;
+       int n, i;
+#if    (GAWK_LDBL_FRAC_BITS == 113 && GAWK_LDBL_MAX_EXP == 16384) 
+       double tx[5];
+       double ty[3];
+       int nx = 5;
+       int prec = 3;
+#else
+       double tx[3];
+       double ty[2];
+       int nx = 3;
+       int prec = 2;
+#endif
+
+       (void) gawk_frexpl(x, & e0);
+       e0 -=  23;
+       if (e0 > 0)
+               x /= pow2ld(e0);
+       else
+               x *= pow2ld(-e0);
+
+       for (i = 0; i < nx - 1; i++) {
+               tx[i] = floor(x);
+               x -= (AWKLDBL) tx[i];
+               x *= pow2ld(24);
+       }
+       tx[nx - 1] = x;
+
+       while (nx > 0 && tx[nx - 1] == _0L)     /* skip zero terms */
+                nx--;
+
+       n = __kernel_rem_pio2(tx, ty, e0, nx, prec);
+
+       /* The result is in 3 or 2 C-doubles, convert into AWKLDBLs. */
+
+#if    (GAWK_LDBL_FRAC_BITS == 113 && GAWK_LDBL_MAX_EXP == 16384)
+       t = (AWKLDBL) ty[2] + (AWKLDBL) ty[1];
+#else
+       t = (AWKLDBL) ty[1];
+#endif
+       w = (AWKLDBL) ty[0];
+
+       y[0] = w + t;
+       y[1] = t - (y[0] - w);
+       return n;
 }
diff --git a/misc/ldbl-tests/atan2.awk b/misc/ldbl-tests/atan2.awk
new file mode 100644
index 0000000..e4ebb05
--- /dev/null
+++ b/misc/ldbl-tests/atan2.awk
@@ -0,0 +1,45 @@
+BEGIN {
+       x1 = 0.1
+       x2 = 131.4321211
+       x3 = 1.1234567e100
+}
+{
+       r1 = atan2(x1, $1)
+       rm1 = atan2(-x1, $1)
+       t1 = atan2($1, x1)
+       tm1 = atan2($1, -x1)
+       r2 = atan2(x2, $1)
+       rm2 = atan2(-x2, $1)
+       t2 = atan2($1, x2)
+       tm2 = atan2($1, -x2)
+       r3 = atan2(x3, $1)
+       rm3 = atan2(-x3, $1)
+       t3 = atan2($1, x3)
+       tm3 = atan2($1, -x3)
+
+       # don't have -M IEEE emulation for 64-bit binary
+       # need to replace huge values with infinities.
+       # "quad" and 64-bit long double has same exponent range.
+       # This does not effect the binary formats B0, B1
+
+       save_PREC = PREC
+       PREC = "quad"
+       r1 += 0; r2 += 0; r3 += 0
+       t1 += 0; t2 += 0; t3 += 0
+       rm1 += 0; rm2 += 0; rm3 += 0
+       tm1 += 0; tm2 += 0; tm3 += 0
+
+       printf("%*.*e\n", 0, DIG, r1)
+       printf("%*.*e\n", 0, DIG, rm1)
+       printf("%*.*e\n", 0, DIG, t1)
+       printf("%*.*e\n", 0, DIG, tm1)
+       printf("%*.*e\n", 0, DIG, r2)
+       printf("%*.*e\n", 0, DIG, rm2)
+       printf("%*.*e\n", 0, DIG, t2)
+       printf("%*.*e\n", 0, DIG, tm2)
+       printf("%*.*e\n", 0, DIG, r3)
+       printf("%*.*e\n", 0, DIG, rm3)
+       printf("%*.*e\n", 0, DIG, t3)
+       printf("%*.*e\n", 0, DIG, tm3)
+       PREC = save_PREC
+}
diff --git a/misc/ldbl-tests/cos.awk b/misc/ldbl-tests/cos.awk
new file mode 100644
index 0000000..048ef91
--- /dev/null
+++ b/misc/ldbl-tests/cos.awk
@@ -0,0 +1,11 @@
+$1 >= 0x100000000 {
+       #
+       # The code in rem_pio2.c (Payne and Hanek Reduction Algorithm)
+       # provides only C-double precision.
+       #
+       printf("%0.15e\n", cos($1))
+       next
+}
+{
+       printf("%*.*e\n", 0, DIG, cos($1))
+}
diff --git a/misc/ldbl-tests/fmod.awk b/misc/ldbl-tests/fmod.awk
new file mode 100644
index 0000000..eb742a3
--- /dev/null
+++ b/misc/ldbl-tests/fmod.awk
@@ -0,0 +1,52 @@
+BEGIN {
+       x1 = 0.1
+       x2 = 131.4321211
+       x3 = 1.1234567e100
+}
+{
+       r1 = x1 % $1
+       rm1 = -x1 % $1
+       t1 = $1 % x1
+       tm1 = $1 % -x1
+       r2 = x2 % $1
+       rm2 = -x2 % $1
+       t2 = $1 % x2
+       tm2 = $1 % -x2
+       r3 = x3 % $1
+       rm3 = -x3 % $1
+       t3 = $1 % x3
+       tm3 = $1 % -x3
+
+       # don't have -M IEEE emulation for 64-bit binary
+       # need to replace huge values with infinities.
+       # "quad" and 64-bit long double has same exponent range.
+       # This does not effect the binary formats B0, B1
+
+#      save_PREC = PREC
+#      PREC = "quad"
+#      r1 += 0; r2 += 0; r3 += 0
+#      t1 += 0; t2 += 0; t3 += 0
+#      rm1 += 0; rm2 += 0; rm3 += 0
+#      tm1 += 0; tm2 += 0; tm3 += 0
+
+       printf("%*.*e\n", 0, DIG, r1)
+       printf("%*.*e\n", 0, DIG, rm1)
+       printf("%*.*e\n", 0, DIG, t1)
+       printf("%*.*e\n", 0, DIG, tm1)
+       printf("%*.*e\n", 0, DIG, r2)
+       printf("%*.*e\n", 0, DIG, rm2)
+       printf("%*.*e\n", 0, DIG, t2)
+       printf("%*.*e\n", 0, DIG, tm2)
+       printf("%*.*e\n", 0, DIG, r3)
+       printf("%*.*e\n", 0, DIG, rm3)
+       printf("%*.*e\n", 0, DIG, t3)
+       printf("%*.*e\n", 0, DIG, tm3)
+#      PREC = save_PREC
+}
+
+END {
+       printf("%*.*e\n", 0, DIG, 8 % 4)
+       printf("%*.*e\n", 0, DIG, -8 % 4)
+       printf("%*.*e\n", 0, DIG, 8 % -4)
+       printf("%*.*e\n", 0, DIG, -8 % -4)
+}
diff --git a/misc/ldbl-tests/sin.awk b/misc/ldbl-tests/sin.awk
new file mode 100644
index 0000000..f79dc72
--- /dev/null
+++ b/misc/ldbl-tests/sin.awk
@@ -0,0 +1,11 @@
+$1 >= 0x100000000 {
+       #
+       # The code in rem_pio2.c (Payne and Hanek Reduction Algorithm)       
+       # provides only C-double precision.        
+       #
+       printf("%0.15e\n", sin($1))
+       next
+}
+{
+       printf("%*.*e\n", 0, DIG, sin($1))
+}
diff --git a/misc/rem_pio2.c b/misc/rem_pio2.c
new file mode 100644
index 0000000..8d8621f
--- /dev/null
+++ b/misc/rem_pio2.c
@@ -0,0 +1,467 @@
+/* Source: freebsd/master/lib/msun/src/k_rem_pio2.c */
+
+/* @(#)k_rem_pio2.c 1.3 95/01/18 */
+/*
+ * ====================================================
+ * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
+ *
+ * Developed at SunSoft, a Sun Microsystems, Inc. business.
+ * Permission to use, copy, modify, and distribute this
+ * software is freely granted, provided that this notice 
+ * is preserved.
+ * ====================================================
+ */
+#if 0
+#ifdef FLT_EVAL_METHOD
+/*
+* Attempt to get strict C99 semantics for assignment with non-C99 compilers.
+*/
+#if FLT_EVAL_METHOD == 0 || __GNUC__ == 0
+#define STRICT_ASSIGN(type, lval, rval) ((lval) = (rval))
+#else
+#define STRICT_ASSIGN(type, lval, rval) do { \
+volatile type __lval; \
+\
+if (sizeof(type) >= sizeof(long double)) \
+(lval) = (rval); \
+else { \
+__lval = (rval); \
+(lval) = __lval; \
+} \
+} while (0)
+#endif
+#endif /* FLT_EVAL_METHOD */
+#endif
+
+
+/* loose extra precision */
+#define STRICT_ASSIGN(type, lval, rval) do { \
+volatile type __lval; \
+__lval = (rval); \
+(lval) = __lval; \
+} while (0)
+
+
+/*
+ * __kernel_rem_pio2(x,y,e0,nx,prec)
+ * double x[],y[]; int e0,nx,prec;
+ * 
+ * __kernel_rem_pio2 return the last three digits of N with 
+ *             y = x - N*pi/2
+ * so that |y| < pi/2.
+ *
+ * The method is to compute the integer (mod 8) and fraction parts of 
+ * (2/pi)*x without doing the full multiplication. In general we
+ * skip the part of the product that are known to be a huge integer (
+ * more accurately, = 0 mod 8 ). Thus the number of operations are
+ * independent of the exponent of the input.
+ *
+ * (2/pi) is represented by an array of 24-bit integers in ipio2[].
+ *
+ * Input parameters:
+ *     x[]     The input value (must be positive) is broken into nx 
+ *             pieces of 24-bit integers in double precision format.
+ *             x[i] will be the i-th 24 bit of x. The scaled exponent 
+ *             of x[0] is given in input parameter e0 (i.e., x[0]*2^e0 
+ *             match x's up to 24 bits.
+ *
+ *             Example of breaking a double positive z into x[0]+x[1]+x[2]:
+ *                     e0 = ilogb(z)-23
+ *                     z  = scalbn(z,-e0)
+ *             for i = 0,1,2
+ *                     x[i] = floor(z)
+ *                     z    = (z-x[i])*2**24
+ *
+ *
+ *     y[]     output result in an array of double precision numbers.
+ *             The dimension of y[] is:
+ *                     24-bit  precision       1
+ *                     53-bit  precision       2
+ *                     64-bit  precision       2
+ *                     113-bit precision       3
+ *             The actual value is the sum of them. Thus for 113-bit
+ *             precison, one may have to do something like:
+ *
+ *             long double t,w,r_head, r_tail;
+ *             t = (long double)y[2] + (long double)y[1];
+ *             w = (long double)y[0];
+ *             r_head = t + w;
+ *             r_tail = w - (r_head - t);
+ *
+ *     e0      The exponent of x[0]. Must be <= 16360 or you need to
+ *              expand the ipio2 table.
+ *
+ *     nx      dimension of x[]
+ *
+ *     prec    an integer indicating the precision:
+ *                     0       24  bits (single)
+ *                     1       53  bits (double)
+ *                     2       64  bits (extended)
+ *                     3       113 bits (quad)
+ *
+ * External function:
+ *     double scalbn(), floor();
+ *
+ *
+ * Here is the description of some local variables:
+ *
+ *     jk      jk+1 is the initial number of terms of ipio2[] needed
+ *             in the computation. The minimum and recommended value
+ *             for jk is 3,4,4,6 for single, double, extended, and quad.
+ *             jk+1 must be 2 larger than you might expect so that our
+ *             recomputation test works. (Up to 24 bits in the integer
+ *             part (the 24 bits of it that we compute) and 23 bits in
+ *             the fraction part may be lost to cancelation before we
+ *             recompute.)
+ *
+ *     jz      local integer variable indicating the number of 
+ *             terms of ipio2[] used. 
+ *
+ *     jx      nx - 1
+ *
+ *     jv      index for pointing to the suitable ipio2[] for the
+ *             computation. In general, we want
+ *                     ( 2^e0*x[0] * ipio2[jv-1]*2^(-24jv) )/8
+ *             is an integer. Thus
+ *                     e0-3-24*jv >= 0 or (e0-3)/24 >= jv
+ *             Hence jv = max(0,(e0-3)/24).
+ *
+ *     jp      jp+1 is the number of terms in PIo2[] needed, jp = jk.
+ *
+ *     q[]     double array with integral value, representing the
+ *             24-bits chunk of the product of x and 2/pi.
+ *
+ *     q0      the corresponding exponent of q[0]. Note that the
+ *             exponent for q[i] would be q0-24*i.
+ *
+ *     PIo2[]  double precision array, obtained by cutting pi/2
+ *             into 24 bits chunks. 
+ *
+ *     f[]     ipio2[] in floating point 
+ *
+ *     iq[]    integer array by breaking up q[] in 24-bits chunk.
+ *
+ *     fq[]    final product of x*(2/pi) in fq[0],..,fq[jk]
+ *
+ *     ih      integer. If >0 it indicates q[] is >= 0.5, hence
+ *             it also indicates the *sign* of the result.
+ *
+ */
+
+
+/*
+ * Constants:
+ * The hexadecimal values are the intended ones for the following 
+ * constants. The decimal values may be used, provided that the 
+ * compiler will convert from decimal to binary accurately enough 
+ * to produce the hexadecimal values shown.
+ */
+
+static const int init_jk[] = {3,4,4,6}; /* initial value for jk */
+
+/*
+ * Table of constants for 2/pi, 396 Hex digits (476 decimal) of 2/pi
+ *
+ *             integer array, contains the (24*i)-th to (24*i+23)-th 
+ *             bit of 2/pi after binary point. The corresponding 
+ *             floating value is
+ *
+ *                     ipio2[i] * 2^(-24(i+1)).
+ *
+ * NB: This table must have at least (e0-3)/24 + jk terms.
+ *     For quad precision (e0 <= 16360, jk = 6), this is 686.
+ */
+static const int32_t ipio2[] = {
+0xA2F983, 0x6E4E44, 0x1529FC, 0x2757D1, 0xF534DD, 0xC0DB62, 
+0x95993C, 0x439041, 0xFE5163, 0xABDEBB, 0xC561B7, 0x246E3A, 
+0x424DD2, 0xE00649, 0x2EEA09, 0xD1921C, 0xFE1DEB, 0x1CB129, 
+0xA73EE8, 0x8235F5, 0x2EBB44, 0x84E99C, 0x7026B4, 0x5F7E41, 
+0x3991D6, 0x398353, 0x39F49C, 0x845F8B, 0xBDF928, 0x3B1FF8, 
+0x97FFDE, 0x05980F, 0xEF2F11, 0x8B5A0A, 0x6D1F6D, 0x367ECF, 
+0x27CB09, 0xB74F46, 0x3F669E, 0x5FEA2D, 0x7527BA, 0xC7EBE5, 
+0xF17B3D, 0x0739F7, 0x8A5292, 0xEA6BFB, 0x5FB11F, 0x8D5D08, 
+0x560330, 0x46FC7B, 0x6BABF0, 0xCFBC20, 0x9AF436, 0x1DA9E3, 
+0x91615E, 0xE61B08, 0x659985, 0x5F14A0, 0x68408D, 0xFFD880, 
+0x4D7327, 0x310606, 0x1556CA, 0x73A8C9, 0x60E27B, 0xC08C6B, 
+
+#if LDBL_MAX_EXP > 1024
+#if LDBL_MAX_EXP > 16384
+#error "ipio2 table needs to be expanded"
+#endif
+0x47C419, 0xC367CD, 0xDCE809, 0x2A8359, 0xC4768B, 0x961CA6,
+0xDDAF44, 0xD15719, 0x053EA5, 0xFF0705, 0x3F7E33, 0xE832C2,
+0xDE4F98, 0x327DBB, 0xC33D26, 0xEF6B1E, 0x5EF89F, 0x3A1F35,
+0xCAF27F, 0x1D87F1, 0x21907C, 0x7C246A, 0xFA6ED5, 0x772D30,
+0x433B15, 0xC614B5, 0x9D19C3, 0xC2C4AD, 0x414D2C, 0x5D000C,
+0x467D86, 0x2D71E3, 0x9AC69B, 0x006233, 0x7CD2B4, 0x97A7B4,
+0xD55537, 0xF63ED7, 0x1810A3, 0xFC764D, 0x2A9D64, 0xABD770,
+0xF87C63, 0x57B07A, 0xE71517, 0x5649C0, 0xD9D63B, 0x3884A7,
+0xCB2324, 0x778AD6, 0x23545A, 0xB91F00, 0x1B0AF1, 0xDFCE19,
+0xFF319F, 0x6A1E66, 0x615799, 0x47FBAC, 0xD87F7E, 0xB76522,
+0x89E832, 0x60BFE6, 0xCDC4EF, 0x09366C, 0xD43F5D, 0xD7DE16,
+0xDE3B58, 0x929BDE, 0x2822D2, 0xE88628, 0x4D58E2, 0x32CAC6,
+0x16E308, 0xCB7DE0, 0x50C017, 0xA71DF3, 0x5BE018, 0x34132E,
+0x621283, 0x014883, 0x5B8EF5, 0x7FB0AD, 0xF2E91E, 0x434A48,
+0xD36710, 0xD8DDAA, 0x425FAE, 0xCE616A, 0xA4280A, 0xB499D3,
+0xF2A606, 0x7F775C, 0x83C2A3, 0x883C61, 0x78738A, 0x5A8CAF,
+0xBDD76F, 0x63A62D, 0xCBBFF4, 0xEF818D, 0x67C126, 0x45CA55,
+0x36D9CA, 0xD2A828, 0x8D61C2, 0x77C912, 0x142604, 0x9B4612,
+0xC459C4, 0x44C5C8, 0x91B24D, 0xF31700, 0xAD43D4, 0xE54929,
+0x10D5FD, 0xFCBE00, 0xCC941E, 0xEECE70, 0xF53E13, 0x80F1EC,
+0xC3E7B3, 0x28F8C7, 0x940593, 0x3E71C1, 0xB3092E, 0xF3450B,
+0x9C1288, 0x7B20AB, 0x9FB52E, 0xC29247, 0x2F327B, 0x6D550C,
+0x90A772, 0x1FE76B, 0x96CB31, 0x4A1679, 0xE27941, 0x89DFF4,
+0x9794E8, 0x84E6E2, 0x973199, 0x6BED88, 0x365F5F, 0x0EFDBB,
+0xB49A48, 0x6CA467, 0x427271, 0x325D8D, 0xB8159F, 0x09E5BC,
+0x25318D, 0x3974F7, 0x1C0530, 0x010C0D, 0x68084B, 0x58EE2C,
+0x90AA47, 0x02E774, 0x24D6BD, 0xA67DF7, 0x72486E, 0xEF169F,
+0xA6948E, 0xF691B4, 0x5153D1, 0xF20ACF, 0x339820, 0x7E4BF5,
+0x6863B2, 0x5F3EDD, 0x035D40, 0x7F8985, 0x295255, 0xC06437,
+0x10D86D, 0x324832, 0x754C5B, 0xD4714E, 0x6E5445, 0xC1090B,
+0x69F52A, 0xD56614, 0x9D0727, 0x50045D, 0xDB3BB4, 0xC576EA,
+0x17F987, 0x7D6B49, 0xBA271D, 0x296996, 0xACCCC6, 0x5414AD,
+0x6AE290, 0x89D988, 0x50722C, 0xBEA404, 0x940777, 0x7030F3,
+0x27FC00, 0xA871EA, 0x49C266, 0x3DE064, 0x83DD97, 0x973FA3,
+0xFD9443, 0x8C860D, 0xDE4131, 0x9D3992, 0x8C70DD, 0xE7B717,
+0x3BDF08, 0x2B3715, 0xA0805C, 0x93805A, 0x921110, 0xD8E80F,
+0xAF806C, 0x4BFFDB, 0x0F9038, 0x761859, 0x15A562, 0xBBCB61,
+0xB989C7, 0xBD4010, 0x04F2D2, 0x277549, 0xF6B6EB, 0xBB22DB,
+0xAA140A, 0x2F2689, 0x768364, 0x333B09, 0x1A940E, 0xAA3A51,
+0xC2A31D, 0xAEEDAF, 0x12265C, 0x4DC26D, 0x9C7A2D, 0x9756C0,
+0x833F03, 0xF6F009, 0x8C402B, 0x99316D, 0x07B439, 0x15200C,
+0x5BC3D8, 0xC492F5, 0x4BADC6, 0xA5CA4E, 0xCD37A7, 0x36A9E6,
+0x9492AB, 0x6842DD, 0xDE6319, 0xEF8C76, 0x528B68, 0x37DBFC,
+0xABA1AE, 0x3115DF, 0xA1AE00, 0xDAFB0C, 0x664D64, 0xB705ED,
+0x306529, 0xBF5657, 0x3AFF47, 0xB9F96A, 0xF3BE75, 0xDF9328,
+0x3080AB, 0xF68C66, 0x15CB04, 0x0622FA, 0x1DE4D9, 0xA4B33D,
+0x8F1B57, 0x09CD36, 0xE9424E, 0xA4BE13, 0xB52333, 0x1AAAF0,
+0xA8654F, 0xA5C1D2, 0x0F3F0B, 0xCD785B, 0x76F923, 0x048B7B,
+0x721789, 0x53A6C6, 0xE26E6F, 0x00EBEF, 0x584A9B, 0xB7DAC4,
+0xBA66AA, 0xCFCF76, 0x1D02D1, 0x2DF1B1, 0xC1998C, 0x77ADC3,
+0xDA4886, 0xA05DF7, 0xF480C6, 0x2FF0AC, 0x9AECDD, 0xBC5C3F,
+0x6DDED0, 0x1FC790, 0xB6DB2A, 0x3A25A3, 0x9AAF00, 0x9353AD,
+0x0457B6, 0xB42D29, 0x7E804B, 0xA707DA, 0x0EAA76, 0xA1597B,
+0x2A1216, 0x2DB7DC, 0xFDE5FA, 0xFEDB89, 0xFDBE89, 0x6C76E4,
+0xFCA906, 0x70803E, 0x156E85, 0xFF87FD, 0x073E28, 0x336761,
+0x86182A, 0xEABD4D, 0xAFE7B3, 0x6E6D8F, 0x396795, 0x5BBF31,
+0x48D784, 0x16DF30, 0x432DC7, 0x356125, 0xCE70C9, 0xB8CB30,
+0xFD6CBF, 0xA200A4, 0xE46C05, 0xA0DD5A, 0x476F21, 0xD21262,
+0x845CB9, 0x496170, 0xE0566B, 0x015299, 0x375550, 0xB7D51E,
+0xC4F133, 0x5F6E13, 0xE4305D, 0xA92E85, 0xC3B21D, 0x3632A1,
+0xA4B708, 0xD4B1EA, 0x21F716, 0xE4698F, 0x77FF27, 0x80030C,
+0x2D408D, 0xA0CD4F, 0x99A520, 0xD3A2B3, 0x0A5D2F, 0x42F9B4,
+0xCBDA11, 0xD0BE7D, 0xC1DB9B, 0xBD17AB, 0x81A2CA, 0x5C6A08,
+0x17552E, 0x550027, 0xF0147F, 0x8607E1, 0x640B14, 0x8D4196,
+0xDEBE87, 0x2AFDDA, 0xB6256B, 0x34897B, 0xFEF305, 0x9EBFB9,
+0x4F6A68, 0xA82A4A, 0x5AC44F, 0xBCF82D, 0x985AD7, 0x95C7F4,
+0x8D4D0D, 0xA63A20, 0x5F57A4, 0xB13F14, 0x953880, 0x0120CC,
+0x86DD71, 0xB6DEC9, 0xF560BF, 0x11654D, 0x6B0701, 0xACB08C,
+0xD0C0B2, 0x485551, 0x0EFB1E, 0xC37295, 0x3B06A3, 0x3540C0,
+0x7BDC06, 0xCC45E0, 0xFA294E, 0xC8CAD6, 0x41F3E8, 0xDE647C,
+0xD8649B, 0x31BED9, 0xC397A4, 0xD45877, 0xC5E369, 0x13DAF0,
+0x3C3ABA, 0x461846, 0x5F7555, 0xF5BDD2, 0xC6926E, 0x5D2EAC,
+0xED440E, 0x423E1C, 0x87C461, 0xE9FD29, 0xF3D6E7, 0xCA7C22,
+0x35916F, 0xC5E008, 0x8DD7FF, 0xE26A6E, 0xC6FDB0, 0xC10893,
+0x745D7C, 0xB2AD6B, 0x9D6ECD, 0x7B723E, 0x6A11C6, 0xA9CFF7,
+0xDF7329, 0xBAC9B5, 0x5100B7, 0x0DB2E2, 0x24BA74, 0x607DE5,
+0x8AD874, 0x2C150D, 0x0C1881, 0x94667E, 0x162901, 0x767A9F,
+0xBEFDFD, 0xEF4556, 0x367ED9, 0x13D9EC, 0xB9BA8B, 0xFC97C4,
+0x27A831, 0xC36EF1, 0x36C594, 0x56A8D8, 0xB5A8B4, 0x0ECCCF,
+0x2D8912, 0x34576F, 0x89562C, 0xE3CE99, 0xB920D6, 0xAA5E6B,
+0x9C2A3E, 0xCC5F11, 0x4A0BFD, 0xFBF4E1, 0x6D3B8E, 0x2C86E2,
+0x84D4E9, 0xA9B4FC, 0xD1EEEF, 0xC9352E, 0x61392F, 0x442138,
+0xC8D91B, 0x0AFC81, 0x6A4AFB, 0xD81C2F, 0x84B453, 0x8C994E,
+0xCC2254, 0xDC552A, 0xD6C6C0, 0x96190B, 0xB8701A, 0x649569,
+0x605A26, 0xEE523F, 0x0F117F, 0x11B5F4, 0xF5CBFC, 0x2DBC34,
+0xEEBC34, 0xCC5DE8, 0x605EDD, 0x9B8E67, 0xEF3392, 0xB817C9,
+0x9B5861, 0xBC57E1, 0xC68351, 0x103ED8, 0x4871DD, 0xDD1C2D,
+0xA118AF, 0x462C21, 0xD7F359, 0x987AD9, 0xC0549E, 0xFA864F,
+0xFC0656, 0xAE79E5, 0x362289, 0x22AD38, 0xDC9367, 0xAAE855,
+0x382682, 0x9BE7CA, 0xA40D51, 0xB13399, 0x0ED7A9, 0x480569,
+0xF0B265, 0xA7887F, 0x974C88, 0x36D1F9, 0xB39221, 0x4A827B,
+0x21CF98, 0xDC9F40, 0x5547DC, 0x3A74E1, 0x42EB67, 0xDF9DFE,
+0x5FD45E, 0xA4677B, 0x7AACBA, 0xA2F655, 0x23882B, 0x55BA41,
+0x086E59, 0x862A21, 0x834739, 0xE6E389, 0xD49EE5, 0x40FB49,
+0xE956FF, 0xCA0F1C, 0x8A59C5, 0x2BFA94, 0xC5C1D3, 0xCFC50F,
+0xAE5ADB, 0x86C547, 0x624385, 0x3B8621, 0x94792C, 0x876110,
+0x7B4C2A, 0x1A2C80, 0x12BF43, 0x902688, 0x893C78, 0xE4C4A8,
+0x7BDBE5, 0xC23AC4, 0xEAF426, 0x8A67F7, 0xBF920D, 0x2BA365,
+0xB1933D, 0x0B7CBD, 0xDC51A4, 0x63DD27, 0xDDE169, 0x19949A,
+0x9529A8, 0x28CE68, 0xB4ED09, 0x209F44, 0xCA984E, 0x638270,
+0x237C7E, 0x32B90F, 0x8EF5A7, 0xE75614, 0x08F121, 0x2A9DB5,
+0x4D7E6F, 0x5119A5, 0xABF9B5, 0xD6DF82, 0x61DD96, 0x023616,
+0x9F3AC4, 0xA1A283, 0x6DED72, 0x7A8D39, 0xA9B882, 0x5C326B,
+0x5B2746, 0xED3400, 0x7700D2, 0x55F4FC, 0x4D5901, 0x8071E0,
+#endif
+
+};
+
+static const double PIo2[] = {
+  1.57079625129699707031e+00, /* 0x3FF921FB, 0x40000000 */
+  7.54978941586159635335e-08, /* 0x3E74442D, 0x00000000 */
+  5.39030252995776476554e-15, /* 0x3CF84698, 0x80000000 */
+  3.28200341580791294123e-22, /* 0x3B78CC51, 0x60000000 */
+  1.27065575308067607349e-29, /* 0x39F01B83, 0x80000000 */
+  1.22933308981111328932e-36, /* 0x387A2520, 0x40000000 */
+  2.73370053816464559624e-44, /* 0x36E38222, 0x80000000 */
+  2.16741683877804819444e-51, /* 0x3569F31D, 0x00000000 */
+};
+
+static const double                    
+zero   = 0.0,
+one    = 1.0,
+two24   =  1.67772160000000000000e+07, /* 0x41700000, 0x00000000 */
+twon24  =  5.96046447753906250000e-08; /* 0x3E700000, 0x00000000 */
+
+static int
+__kernel_rem_pio2(double *x, double *y, int e0, int nx, int prec)
+{
+       int32_t jz,jx,jv,jp,jk,carry,n,iq[20],i,j,k,m,q0,ih;
+       double z,fw,f[20],fq[20],q[20];
+
+    /* initialize jk*/
+       jk = init_jk[prec];
+       jp = jk;
+
+    /* determine jx,jv,q0, note that 3>q0 */
+       jx =  nx-1;
+       jv = (e0-3)/24; if(jv<0) jv=0;
+       q0 =  e0-24*(jv+1);
+
+    /* set up f[0] to f[jx+jk] where f[jx+jk] = ipio2[jv+jk] */
+       j = jv-jx; m = jx+jk;
+       for(i=0;i<=m;i++,j++) f[i] = (j<0)? zero : (double) ipio2[j];
+
+    /* compute q[0],q[1],...q[jk] */
+       for (i=0;i<=jk;i++) {
+           for(j=0,fw=0.0;j<=jx;j++) fw += x[j]*f[jx+i-j]; q[i] = fw;
+       }
+
+       jz = jk;
+recompute:
+    /* distill q[] into iq[] reversingly */
+       for(i=0,j=jz,z=q[jz];j>0;i++,j--) {
+           fw    =  (double)((int32_t)(twon24* z));
+           iq[i] =  (int32_t)(z-two24*fw);
+           z     =  q[j-1]+fw;
+       }
+
+    /* compute n */
+       z  = scalbn(z,q0);              /* actual value of z */
+       z -= 8.0*floor(z*0.125);                /* trim off integer >= 8 */
+       n  = (int32_t) z;
+       z -= (double)n;
+       ih = 0;
+       if(q0>0) {      /* need iq[jz-1] to determine n */
+           i  = (iq[jz-1]>>(24-q0)); n += i;
+           iq[jz-1] -= i<<(24-q0);
+           ih = iq[jz-1]>>(23-q0);
+       } 
+       else if(q0==0) ih = iq[jz-1]>>23;
+       else if(z>=0.5) ih=2;
+
+       if(ih>0) {      /* q > 0.5 */
+           n += 1; carry = 0;
+           for(i=0;i<jz ;i++) {        /* compute 1-q */
+               j = iq[i];
+               if(carry==0) {
+                   if(j!=0) {
+                       carry = 1; iq[i] = 0x1000000- j;
+                   }
+               } else  iq[i] = 0xffffff - j;
+           }
+           if(q0>0) {          /* rare case: chance is 1 in 12 */
+               switch(q0) {
+               case 1:
+                  iq[jz-1] &= 0x7fffff; break;
+               case 2:
+                  iq[jz-1] &= 0x3fffff; break;
+               }
+           }
+           if(ih==2) {
+               z = one - z;
+               if(carry!=0) z -= scalbn(one,q0);
+           }
+       }
+
+    /* check if recomputation is needed */
+       if(z==zero) {
+           j = 0;
+           for (i=jz-1;i>=jk;i--) j |= iq[i];
+           if(j==0) { /* need recomputation */
+               for(k=1;iq[jk-k]==0;k++);   /* k = no. of terms needed */
+
+               for(i=jz+1;i<=jz+k;i++) {   /* add q[jz+1] to q[jz+k] */
+                   f[jx+i] = (double) ipio2[jv+i];
+                   for(j=0,fw=0.0;j<=jx;j++) fw += x[j]*f[jx+i-j];
+                   q[i] = fw;
+               }
+               jz += k;
+               goto recompute;
+           }
+       }
+
+    /* chop off zero terms */
+       if(z==0.0) {
+           jz -= 1; q0 -= 24;
+           while(iq[jz]==0) { jz--; q0-=24;}
+       } else { /* break z into 24-bit if necessary */
+           z = scalbn(z,-q0);
+           if(z>=two24) { 
+               fw = (double)((int32_t)(twon24*z));
+               iq[jz] = (int32_t)(z-two24*fw);
+               jz += 1; q0 += 24;
+               iq[jz] = (int32_t) fw;
+           } else iq[jz] = (int32_t) z ;
+       }
+
+    /* convert integer "bit" chunk to floating-point value */
+       fw = scalbn(one,q0);
+       for(i=jz;i>=0;i--) {
+           q[i] = fw*(double)iq[i]; fw*=twon24;
+       }
+
+    /* compute PIo2[0,...,jp]*q[jz,...,0] */
+       for(i=jz;i>=0;i--) {
+           for(fw=0.0,k=0;k<=jp&&k<=jz-i;k++) fw += PIo2[k]*q[i+k];
+           fq[jz-i] = fw;
+       }
+
+    /* compress fq[] into y[] */
+       switch(prec) {
+           case 0:
+               fw = 0.0;
+               for (i=jz;i>=0;i--) fw += fq[i];
+               y[0] = (ih==0)? fw: -fw; 
+               break;
+           case 1:
+           case 2:
+               fw = 0.0;
+               for (i=jz;i>=0;i--) fw += fq[i]; 
+               STRICT_ASSIGN(double,fw,fw);
+               y[0] = (ih==0)? fw: -fw;
+               fw = fq[0]-fw;
+               for (i=1;i<=jz;i++) fw += fq[i];
+               y[1] = (ih==0)? fw: -fw; 
+               break;
+           case 3:     /* painful */
+               for (i=jz;i>0;i--) {
+                   fw      = fq[i-1]+fq[i]; 
+                   fq[i]  += fq[i-1]-fw;
+                   fq[i-1] = fw;
+               }
+               for (i=jz;i>1;i--) {
+                   fw      = fq[i-1]+fq[i]; 
+                   fq[i]  += fq[i-1]-fw;
+                   fq[i-1] = fw;
+               }
+               for (fw=0.0,i=jz;i>=2;i--) fw += fq[i]; 
+               if(ih==0) {
+                   y[0] =  fq[0]; y[1] =  fq[1]; y[2] =  fw;
+               } else {
+                   y[0] = -fq[0]; y[1] = -fq[1]; y[2] = -fw;
+               }
+       }
+       return n&7;
+}
+

-----------------------------------------------------------------------

Summary of changes:
 TODO.LDBL                 |   44 +++--
 long_double.c             |   14 +-
 long_double.h             |   18 +-
 misc/ChangeLog            |    7 +
 misc/Makefile             |   71 +++++--
 misc/float128.c           |   64 +++----
 misc/float80.c            |    5 +-
 misc/fp_math.awk          |  125 +++++--------
 misc/gawk_math.c          |  386 +++++++++++++++++++++++++++++++------
 misc/ldbl-tests/atan2.awk |   45 +++++
 misc/ldbl-tests/cos.awk   |   11 +
 misc/ldbl-tests/fmod.awk  |   52 +++++
 misc/ldbl-tests/sin.awk   |   11 +
 misc/rem_pio2.c           |  467 +++++++++++++++++++++++++++++++++++++++++++++
 14 files changed, 1097 insertions(+), 223 deletions(-)
 create mode 100644 misc/ldbl-tests/atan2.awk
 create mode 100644 misc/ldbl-tests/cos.awk
 create mode 100644 misc/ldbl-tests/fmod.awk
 create mode 100644 misc/ldbl-tests/sin.awk
 create mode 100644 misc/rem_pio2.c


hooks/post-receive
-- 
gawk
[Prev in Thread]
Current Thread
[Next in Thread]
[gawk-diffs] [SCM] gawk branch, long-double, updated. 3a4e6e9c74d0ba859b56ba066850dbba3b15eb22, John Haque <=
Prev by Date: [gawk-diffs] [SCM] gawk branch, cmake, updated. dac059baf5518edf195c873db91eb42af6551586
Next by Date: [gawk-diffs] [SCM] gawk branch, cmake, updated. 251fa761931520336e19fb09af583954a6f481ce
Previous by thread: [gawk-diffs] [SCM] gawk branch, cmake, updated. dac059baf5518edf195c873db91eb42af6551586
Next by thread: [gawk-diffs] [SCM] gawk branch, cmake, updated. 251fa761931520336e19fb09af583954a6f481ce
Index(es):
- Date
- Thread