
[bug-recutils] Seekable parsers


From: Michał Masłowski
Subject: [bug-recutils] Seekable parsers
Date: Wed, 23 May 2012 11:09:47 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

Hello,

Using index files would require parsing only some of the records in a
file.  This could be done in one of these ways:

(0) passing a file to the parser and seeking it from other code

(1) making a new parser for each record and passing it a memory buffer
    starting at an appropriate point

(2) adding parser interfaces to seek it to another place in the file or
    memory buffer.

I think solution (0) is not elegant enough and could lead to bugs in
non-obvious cases.  (1) might be slower than (2).  I think (2) won't be
difficult to implement or test, so I'll implement it unless you
recommend a different solution.

If we use mmap to read the recfile, parsers would need an additional
interface so that they do not read past the end of the file, e.g. like
this one:

/* Create a parser associated with a given buffer that will be used as
   the source for the tokens.  The buffer is of specified size and
   doesn't have to be null-terminated.  If not enough memory, return
   NULL.  */

rec_parser_t rec_parser_new_mem (const char *buffer, size_t size,
                                 const char *source);
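
A minimal sketch of the bounds checking such a parser would need
internally, assuming it reads one byte at a time.  The names mem_reader,
mem_reader_init and mem_reader_getc are hypothetical and are not part of
librec; the point is only that the end of input is decided by the size,
not by a null byte, since a mapped file is not guaranteed to end in one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* for EOF */

/* Hypothetical bounded reader over a memory buffer (not librec API).  */
struct mem_reader
{
  const char *buf;  /* start of the buffer; need not be null-terminated */
  size_t size;      /* number of valid bytes in buf */
  size_t pos;       /* offset of the next byte to return */
};

static void
mem_reader_init (struct mem_reader *r, const char *buf, size_t size)
{
  assert (r != NULL && (buf != NULL || size == 0));
  r->buf = buf;
  r->size = size;
  r->pos = 0;
}

/* Return the next byte as an unsigned char converted to int, or EOF
   once all size bytes have been consumed.  Never reads buf[size].  */
static int
mem_reader_getc (struct mem_reader *r)
{
  if (r->pos >= r->size)
    return EOF;
  return (unsigned char) r->buf[r->pos++];
}
```

With this shape, the same tokenizer loop can run over a mapped file or
an ordinary string, as long as the caller passes the right size.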

Another problem is keeping line numbers correct with any of the three
solutions above.  This can be solved by adding another function to set
the line number or, for (2), by setting it together with the position in
the file.
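
For illustration, this is roughly how code building an index could
compute the line number to store next to each record offset.  It
assumes 1-based line numbers and '\n' line endings; line_at_offset is a
hypothetical helper, not an existing librec function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the 1-based line number of the byte at
   OFFSET in BUF, i.e. one plus the number of newlines before it.
   Offsets past SIZE are clamped to SIZE.  */
static size_t
line_at_offset (const char *buf, size_t size, size_t offset)
{
  size_t line = 1;
  size_t i;

  assert (buf != NULL || size == 0);
  if (offset > size)
    offset = size;
  for (i = 0; i < offset; i++)
    if (buf[i] == '\n')
      line++;
  return line;
}
```

Counting from the start for every record would be quadratic, so a real
index builder would more likely keep a running count while scanning the
file once.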

This interface could be used for changing parser position in (2):

/* Change the position of the parser in the file or memory buffer.  The
   line number is only used to store it in the records parsed
   afterwards.  */

void rec_parser_seek (rec_parser_t parser, size_t line_number,
                      size_t position);
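
A rough sketch of what seeking could do for a memory-backed parser,
using a toy structure instead of the real rec_parser_t.  Everything
named toy_parser here is hypothetical.  Unlike the void prototype above,
this version returns a bool so that an out-of-range position can be
reported; that is an assumption of the sketch, not part of the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>   /* for EOF */

/* Toy stand-in for a memory-backed parser (not the librec type).  */
struct toy_parser
{
  const char *buf;  /* input buffer, not necessarily null-terminated */
  size_t size;      /* number of valid bytes in buf */
  size_t pos;       /* offset of the next byte to read */
  size_t line;      /* line number of the byte at pos */
};

/* Move the parser to POSITION and trust the caller's LINE_NUMBER.
   Return false and leave the parser unchanged if POSITION is past the
   end of the buffer.  */
static bool
toy_parser_seek (struct toy_parser *p, size_t line_number, size_t position)
{
  assert (p != NULL);
  if (position > p->size)
    return false;
  p->pos = position;
  p->line = line_number;
  return true;
}

/* Read one byte, keeping the line counter in step with the input.  */
static int
toy_parser_getc (struct toy_parser *p)
{
  int c;

  if (p->pos >= p->size)
    return EOF;
  c = (unsigned char) p->buf[p->pos++];
  if (c == '\n')
    p->line++;
  return c;
}
```

An index entry would then hold a (position, line number) pair per
record, and the caller would seek before parsing each wanted record.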
