bug-gawk

Re: Read a fixed length of input each time


From: Peng Yu
Subject: Re: Read a fixed length of input each time
Date: Tue, 23 Jun 2020 12:16:32 -0500

I don't think it can be done with a one-liner in Python. I am not sure
about Perl, but even if it can be done with a one-liner, the syntax may
be more complicated than necessary.

Even if bin2uint() can be implemented in an extension, the fixed-width
parsing still needs to be supported by core gawk.

I think the gist of gawk is to treat input as a stream and break it
into records and fields. Any input that can be broken into records and
fields could, in theory, be processed by this model of gawk. But
records and fields need not be marked by delimiters. They can also be
delimited by fixed length.

Thus, the text input restriction is not an essential limitation of
awk. awk could have two modes, one for text input and the other for
binary input. I don't think there would be any conflict between the
two modes.
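
To make the fixed-length record and field model concrete, here is a
minimal sketch in Python rather than awk. It is only an illustration of
the idea, not a gawk feature. The names read_fixed_records, record_len
and field_len are made up for this example, and bin2uint() is assumed to
mean "interpret the bytes as a big-endian unsigned integer", which the
thread does not specify.

```python
import io


def bin2uint(field: bytes) -> int:
    """Interpret a byte string as an unsigned integer.

    Big-endian byte order is an assumption made for this sketch.
    """
    return int.from_bytes(field, byteorder="big", signed=False)


def read_fixed_records(stream, record_len: int, field_len: int):
    """Yield each fixed-length record as a list of fixed-width byte fields.

    `stream` must be a buffered binary file-like object, so that
    read(n) returns fewer than n bytes only at end of input.
    """
    if record_len <= 0 or field_len <= 0 or record_len % field_len != 0:
        raise ValueError("record_len must be a positive multiple of field_len")
    while True:
        record = stream.read(record_len)
        if not record:
            break  # end of input
        if len(record) < record_len:
            raise ValueError("truncated final record")
        # Slice the record into fields of equal width.
        yield [record[i:i + field_len] for i in range(0, record_len, field_len)]


if __name__ == "__main__":
    # Two 8-byte records, each holding two 4-byte fields.
    data = bytes([0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 2, 0xFF, 0xFF, 0xFF, 0xFF])
    for fields in read_fixed_records(io.BytesIO(data), record_len=8, field_len=4):
        for field in fields:
            print(bin2uint(field))
```

The loop over records and fields mirrors the awk pattern of iterating
from $1 to $NF, with the record length and field width taking the place
of RS and FS.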

On 6/23/20, Andrew J. Schorr <aschorr@telemetry-investments.com> wrote:
> On Tue, Jun 23, 2020 at 11:45:32AM -0500, Peng Yu wrote:
>> For example, I may do the reverse processing to get back the integers
>> from the binary stream. For this kind of fixed binary stream, I feel
>> the awk syntax would be better than writing a much longer Go file.
>>
>> For example, if gawk were able to split fields based on a fixed
>> field width and a fixed NF, then I could simply write awk
>> code like the following (suppose that there is a bin2uint() function).
>> It would be much shorter than the equivalent Go code to convert such a
>> binary stream to integers.
>>
>> $ awk -e '{ for (i = 1; i <= NF; ++i) { print bin2uint($i) } }'
>
> To accomplish that, you'd need to write an extension library that allowed
> you to specify the number of bytes in a record and the number of bytes in
> each field. I'm not aware of any other way of achieving that, but it should
> be pretty easy using the extension API. And then you'd have to write the
> bin2uint function, which I guess can be done using the ord() function in
> the ordchr extension. But it still seems like this is probably much easier
> in other languages such as Perl or Python (or perhaps Go; I'm less familiar
> with that one). It's best to use the right tool for a given task; AWK is
> designed for text manipulation, but other languages are better for
> handling binary data.
>
> Regards,
> Andy
>


-- 
Regards,
Peng


