Re: Problem (multiline records) with gawk 3.0.6
From: Aharon Robbins
Subject: Re: Problem (multiline records) with gawk 3.0.6
Date: Wed, 7 Mar 2001 11:28:37 +0200
Greetings. Re the bug report below, you have indeed
found a bug. Thanks for the cogent bug report.
Here is a patch.
Thanks,
Arnold
----------------------------------------------------------
*** io.c.save Sun Jul 16 06:13:59 2000
--- io.c Wed Mar 7 11:10:55 2001
***************
*** 1722,1732 ****
/*
* Leading newlines at the beginning of the file
* should be ignored. Whew!
- *
- * Is this code ever executed?
*/
! if (RS_is_null && RESTART(rsre, start) == 0) {
! start += REEND(rsre, start);
goto again;
}
bp = start + RESTART(rsre, start);
--- 1722,1737 ----
/*
* Leading newlines at the beginning of the file
* should be ignored. Whew!
*/
! if (RS_is_null && *start == '\n') {
! /*
! * have to catch the case of a
! * single newline at the front of
! * the record, which the regex
! * doesn't. gurr.
! */
! while (*start == '\n' && start < iop->end)
! start++;
goto again;
}
bp = start + RESTART(rsre, start);
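
For anyone who wants the gist of the patched branch outside of io.c, here is a minimal standalone sketch. The helper name skip_leading_newlines and its two-pointer interface are made up for illustration and are not part of gawk; the real code advances start in place and compares against iop->end. The sketch also tests the bound before dereferencing, which is an assumption about what is safest rather than a copy of the loop above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of gawk: advance past any leading
 * newlines in the buffer [start, end) and return the new position.
 * The bounds test comes first so *start is never read at end.
 */
static const char *
skip_leading_newlines(const char *start, const char *end)
{
	assert(start != NULL && end != NULL && start <= end);
	while (start < end && *start == '\n')
		start++;
	return start;
}
```

Called on a buffer that begins with a single newline followed by "Jane Doe", it returns a pointer to the 'J', which is the case the regex match alone did not catch.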
------------------------------------------------------------------------
> From: "Volker Kiefel" <address@hidden>
> To: <address@hidden>
> Subject: Problem (multiline records) with gawk 3.0.6
> Date: Mon, 5 Mar 2001 12:39:29 +0100
>
> Possible bug in gawk v3.0.6
>
> Hello dear maintainer(s) of GAWK,
> hello Arnold Robbins,
>
> Thank you for the excellent GAWK tool which I often use
> (under Windows 95 SE) instead of commercial spreadsheet
> software. Recently I encountered some problems in an
> attempt to process multiple line records. The problem is
> best exemplified with the example in "Effective
> AWK Programming" (chapter 5.7: multiple-line
> records). The data file:
>
> Jane Doe
> 123 Main Street
> Anywhere, SE 12345-6789
>
> John Smith
> 456 Tree-lined Avenue
> Smallville, MW 98765-4321
>
> is processed with the AWK script
>
> BEGIN {
> RS = ""; FS = "\n"
> }
>
> {
> print "Name is: ", $1
> print "Address is: ", $2
> print "City and State are: ", $3
> print ""
> }
>
> A problem arises with one leading newline (empty line)
> in the "addresses" data file before "Jane Doe". The
> (unexpected) output is:
>
> Name is:
> Address is: Jane Doe
> City and State are: 123 Main Street
>
> Name is: John Smith
> Address is: 456 Tree-lined Avenue
> City and State are: Smallville, MW 98765-4321
>
> i.e., the fields of the first record are shifted by one.
> With RS="" leading empty lines should not influence the
> result. 0 or 2 leading empty lines produce the expected
> result:
>
> Name is: Jane Doe
> Address is: 123 Main Street
> City and State are: Anywhere, SE 12345-6789
>
> Name is: John Smith
> Address is: 456 Tree-lined Avenue
> City and State are: Smallville, MW 98765-4321
>
> Kernighan's AWK in its current version does not show
> this unexpected behaviour.
>
> I use gawk v3.0.6 compiled with the mingw32
> implementation of the GCC (v2.95.2) and with the
> djgpp-implementation of GCC. Can this problem be
> addressed in a future version of GAWK?
>
> Yours sincerely
> Volker Kiefel
> (Rostock, Germany)
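
The behaviour the report expects from RS = "" can be sketched in a few lines of C: skip every leading newline, then let the record run up to the next blank line. This is only an illustration under simplifying assumptions (the whole input is one NUL-terminated string); the struct record type and the function first_paragraph_record are hypothetical names and do not mirror how gawk's io.c reads input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only; gawk's real reader lives in io.c. */
struct record {
	const char *text;	/* first byte of the record */
	size_t len;		/* length, without the terminating blank line */
};

/*
 * Return the first paragraph-mode (RS = "") record in the
 * NUL-terminated string buf: leading newlines are skipped, and the
 * record runs up to the next blank line or the end of the input.
 */
static struct record
first_paragraph_record(const char *buf)
{
	struct record rec;
	const char *sep;

	assert(buf != NULL);
	buf += strspn(buf, "\n");	/* any number of leading newlines */
	sep = strstr(buf, "\n\n");	/* a blank line ends the record */
	rec.text = buf;
	rec.len = (sep != NULL) ? (size_t)(sep - buf) : strlen(buf);
	/* drop one trailing newline when the input ends without a blank line */
	if (sep == NULL && rec.len > 0 && buf[rec.len - 1] == '\n')
		rec.len--;
	return rec;
}
```

With zero, one, or two empty lines in front of "Jane Doe", the returned record starts at "Jane Doe" in every case, so the first field is the name rather than an empty string.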