[bug-gawk] Computed regex and getline bug / issue


From: Grail Dane
Subject: [bug-gawk] Computed regex and getline bug / issue
Date: Sun, 4 May 2014 17:31:55 +0800

Hello

As part of an exercise in displaying data from a file, I have come across an issue which neither I nor any of the
good people at linuxquestions.org have been able to solve, and I believe it may be a bug within gawk.

Using the following data as an input file:

1 , 2
3 , 4
5 , 6
7 , 8
9 , 10

In case this does not display correctly, the format of each line is: number, space, comma, space, number.
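
For anyone wanting to reproduce this, the input can be recreated along these lines. This is only a sketch: it assumes the file is named `file`, as in the commands in this message, and that it ends with a trailing newline.

```shell
# Recreate the five-line sample input; each line is: number, space, comma, space, number.
# Assumption: the original file ends with a newline after the last line.
printf '%s\n' '1 , 2' '3 , 4' '5 , 6' '7 , 8' '9 , 10' > file

# Show the contents to confirm the layout.
cat file
```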

Using the following basic gawk command, we get the data back as follows:

$ awk '{print "|"$0"|"}' RS='[,\n]' file
|1 |
| 2|
|3 |
| 4|
|5 |
| 6|
|7 |
| 8|
|9 |
| 10|

The pipes are included simply to show white space.

If we then use getline prior to our print we receive:

$ awk '{getline;print "|"$0"|"}' RS='[,\n]' file
| 2|
| 4|
| 6|
| 8|
| 10|
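
To make it clearer why only every second record is printed, here is a small sketch that keeps the record read by the main loop and prints it next to the record that the plain getline then loads into $0. The variable name `first` and the "then" label are only for illustration, and the input is recreated under the same assumptions as the sample data above.

```shell
# Recreate the sample input (assumed to end with a trailing newline).
printf '%s\n' '1 , 2' '3 , 4' '5 , 6' '7 , 8' '9 , 10' > file

# Save the record from the main loop, then let getline replace $0 with the next record.
pairs=$(awk '{ first = $0; getline; print "|" first "| then |" $0 "|" }' RS='[,\n]' file)
printf '%s\n' "$pairs"
```

With this RS, each output line should pair an odd-numbered record with the even-numbered one that follows it, for example `|1 | then | 2|`.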

That, again, is all fine. However, if we then extend the RS computed regex to also match spaces, the plain print version gives the same records as before, minus the spaces:

$ awk '{print "|"$0"|"}' RS='[,\n ]+' file
|1|
|2|
|3|
|4|
|5|
|6|
|7|
|8|
|9|
|10|

Again, as expected.  Once we go back to our getline version where we expect to return every second record, we now see our 'bug':

$ awk '{getline;print "|"$0"|"}' RS='[,\n ]+' f2
|2|
|4|
|6|
|8|
|9|   <-- This should have been |10|
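
A possible way to narrow this down is to print what getline returns and the value of NR after each call, and to compare that against a version that selects every second record without getline at all. This is only a diagnostic sketch run against the sample data (recreated here as `file`); the variable names `ret`, `diag` and `evens` are purely illustrative, and no particular output is claimed for the getline version on an affected gawk.

```shell
# Recreate the sample input (assumed to end with a trailing newline).
printf '%s\n' '1 , 2' '3 , 4' '5 , 6' '7 , 8' '9 , 10' > file

# getline version: show getline's return value (1 = record read, 0 = end of
# input, -1 = error) together with NR and the resulting $0.
diag=$(awk '{ ret = getline; print "ret=" ret, "NR=" NR, "|" $0 "|" }' RS='[,\n ]+' file)
printf '%s\n' "$diag"

# Comparison without getline: print only the even-numbered records.
evens=$(awk 'NR % 2 == 0 { print "|" $0 "|" }' RS='[,\n ]+' file)
printf '%s\n' "$evens"
```

Since the plain print version above returns all ten records correctly, the second command should end with |10|. If the last line of the first command shows a return value other than 1, or an NR of 9, that would suggest getline is reporting end of input before the final record is delivered.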

The thread with further discussion of this issue can be found here: http://www.linuxquestions.org/questions/programming-9/peculiar-awk-behaviour-confusing-me-4175503599/

Please advise if you require any further information.

Cheers
Grail
