From: Jose E. Marchesi
Subject: Re: [bug-recutils] Seekable parsers
Date: Thu, 24 May 2012 22:03:58 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.92 (gnu/linux)
Hi.
> (0) passing a file to the parser and seeking it from other code
> (1) making a new parser for each record and passing it a memory buffer
>     starting at an appropriate point
> (2) adding parser interfaces to seek it to another place in the file or
>     memory buffer.
>
> I think the (0) solution is not elegant enough and could lead to bugs in
> nonobvious cases. (1) might be slower than (2). I think (2) won't be
> difficult to implement or test, so I'll implement it unless you
> recommend a different solution.
(2) is definitely the way to go.
> If we use mmap to read the recfile, parsers would need an additional
> interface to not read past the end of the file, e.g. like this one:
>
>   /* Create a parser associated with a given buffer that will be used as
>      the source for the tokens.  The buffer is of specified size and
>      doesn't have to be null-terminated.  If not enough memory, return
>      NULL.  */
>   rec_parser_t rec_parser_new_mem (const char *buffer, size_t size,
>                                    const char *source);
That is OK. But since the special case where BUFFER is null-terminated
can easily be handled by passing strlen (buffer) as the SIZE argument, I
would not introduce a new function. Just rename rec_parser_new_str to
rec_parser_new_mem.
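As an illustration of the sized-buffer idea, here is a minimal sketch. The
names struct mem_parser, mem_parser_new and mem_parser_getc are hypothetical
stand-ins and not the librec implementation; the only point is that the
reader stops at SIZE bytes instead of looking for a terminating null.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */
#include <stdlib.h>

/* Hypothetical stand-in for the real parser type; not the librec layout.  */
struct mem_parser
{
  const char *buffer;  /* Not owned; need not be null-terminated.  */
  size_t size;         /* Number of readable bytes in BUFFER.  */
  size_t pos;          /* Offset of the next byte to read.  */
  const char *source;  /* Name used in diagnostics; may be NULL.  */
};

/* Create a parser over the first SIZE bytes of BUFFER.  Return NULL if
   there is not enough memory.  */
struct mem_parser *
mem_parser_new (const char *buffer, size_t size, const char *source)
{
  struct mem_parser *parser;

  assert (buffer != NULL || size == 0);

  parser = malloc (sizeof *parser);
  if (parser != NULL)
    {
      parser->buffer = buffer;
      parser->size = size;
      parser->pos = 0;
      parser->source = source;
    }

  return parser;
}

/* Return the next byte, or EOF once SIZE bytes have been consumed.  The
   bounds check is what keeps an mmap'ed file from being overrun.  */
int
mem_parser_getc (struct mem_parser *parser)
{
  if (parser->pos >= parser->size)
    return EOF;

  return (unsigned char) parser->buffer[parser->pos++];
}
```

With this shape, a null-terminated string STR needs no constructor of its
own: mem_parser_new (str, strlen (str), "name") covers it.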
But then, is using mmap the best option here? An alternative would be
to expand the fopen-based parser backend in order to use fseek/ftell.
The FILE* functions have fewer portability issues than mmap, and the
parser is character-oriented anyway.
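To make the fseek/ftell alternative concrete, here is a small sketch using
only standard C. The helper read_line_at is a hypothetical name, not part of
recutils: it returns to an offset previously obtained from ftell and reads
one line from there, which is roughly what a FILE*-based backend would do
before parsing a record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Seek FP to OFFSET, a value previously returned by ftell on the same
   stream, and read one line into OUT (at most OUT_SIZE - 1 characters).
   Return 0 on success and -1 on failure.  */
int
read_line_at (FILE *fp, long offset, char *out, size_t out_size)
{
  assert (fp != NULL && out != NULL && out_size > 0);

  if (fseek (fp, offset, SEEK_SET) != 0)
    return -1;

  if (fgets (out, (int) out_size, fp) == NULL)
    return -1;

  return 0;
}
```

Note that ftell returns a long, so on platforms where long is 32 bits this
sketch is limited to files under 2 GiB.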
> Another problem is keeping line numbers correct when using any of the
> above three solutions. This can be solved by adding another function to
> set the line number, or (for (2)) to set it with the position in file.
> This interface could be used for changing parser position in (2):
>
>   /* Change the position in file of the parser.  The line number is only
>      used to store it in the parsed records.  */
>   void rec_parser_seek (rec_parser_t parser, size_t line_number,
>                         size_t position);
Yes, it is good to force the user to specify the new line number.
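A rough sketch of why the caller has to supply the line number: after a
jump, the parser has not seen the newlines it skipped, so it cannot derive
the line on its own without rescanning. The type struct seekable_parser and
both functions below are hypothetical illustrations over a memory buffer,
not the librec code; only the argument order follows the proposed
rec_parser_seek.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Hypothetical parser state over a memory buffer.  */
struct seekable_parser
{
  const char *buffer;  /* Not owned; need not be null-terminated.  */
  size_t size;         /* Number of readable bytes in BUFFER.  */
  size_t pos;          /* Offset of the next byte to read.  */
  size_t line;         /* Line number of the next byte, 1-based.  */
};

/* Move PARSER to POSITION and record LINE_NUMBER as the current line.
   The line number is taken on trust from the caller.  */
void
seekable_parser_seek (struct seekable_parser *parser, size_t line_number,
                      size_t position)
{
  assert (position <= parser->size);

  parser->pos = position;
  parser->line = line_number;
}

/* Return the next byte and keep the line counter up to date, or return
   EOF at the end of the buffer.  */
int
seekable_parser_getc (struct seekable_parser *parser)
{
  int c;

  if (parser->pos >= parser->size)
    return EOF;

  c = (unsigned char) parser->buffer[parser->pos++];
  if (c == '\n')
    parser->line++;

  return c;
}
```

An index built in a first pass could store (position, line) pairs for each
record and hand both back to the seek function later.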
--
Jose E. Marchesi http://www.jemarch.net
GNU Project http://www.gnu.org