[eLyXer-users] Re: eLyXer 0.98 released
From: Alex Fernandez
Subject: [eLyXer-users] Re: eLyXer 0.98 released
Date: Thu, 13 May 2010 23:20:36 +0200
Hi again,
On Thu, May 13, 2010 at 7:21 PM, Philipp Reichmuth
<address@hidden> wrote:
> Here's a question. Recently I've been writing some Python scripts to parse
> LyX files for keywords to generate indices (in a rather special scenario,
> looking for text formatted in some character styles only if it appears in
> column 1 or 3 of a table, that sort of thing). I used eLyXer's classes
> basically as a parser for LyX files, without using the conversion
> functionality at all. I find it quite useful because it gives me tree
> models of a LyX file. You can then write all sorts of tree traversal
> functions that do what you want, even something like XPath.
There are two reasons why this task should be relatively easy:
- First, the whole parsing structure is created using Python objects
called Containers. This makes manipulating the tree quite easy.
- Second, due to some concerns in the LyX dev list about format
instability, I refactored all parsing code for 0.98 into its own
src/parse package. All output code is also in Output classes, even if
it's more intermixed.
I have tried to document how eLyXer containers work in the dev guide:
http://www.nongnu.org/elyxer/devguide.html#toc-Subsection-1.2
The debugging function container.tree() might be useful as it shows
the whole structure at a glance.
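As a rough sketch of the kind of traversal you describe (keywords in a
given style, but only in columns 1 or 3 of a table): the Node class and
the 'table', 'row', 'cell' and 'keyword' kinds below are stand-ins made
up for illustration, not eLyXer's actual Container API, so the attribute
names would need adapting to the real classes.

```python
class Node(object):
    """Hypothetical stand-in for a parsed container; not eLyXer's real class."""

    def __init__(self, kind, text='', contents=None):
        self.kind = kind                # e.g. 'table', 'row', 'cell', 'keyword'
        self.text = text                # text carried by leaf nodes
        self.contents = contents or []  # child nodes, in document order


def walk(node):
    """Yield the node itself and then all of its descendants, depth first."""
    yield node
    for child in node.contents:
        for descendant in walk(child):
            yield descendant


def find_in_columns(table, kind, columns):
    """Collect the text of nodes of a given kind found in the given columns.

    Columns are counted from 1, and each child of a row is assumed to be a cell.
    """
    found = []
    for row in table.contents:
        for index, cell in enumerate(row.contents, 1):
            if index not in columns:
                continue
            found.extend(n.text for n in walk(cell) if n.kind == kind)
    return found


# A tiny hand-built tree standing in for a parsed document.
table = Node('table', contents=[
    Node('row', contents=[
        Node('cell', contents=[Node('keyword', 'alpha')]),
        Node('cell', contents=[Node('keyword', 'beta')]),
        Node('cell', contents=[Node('plain', 'x'), Node('keyword', 'gamma')]),
    ]),
])

keywords = find_in_columns(table, 'keyword', (1, 3))
```

Here keywords ends up holding the text found in columns 1 and 3 only; the
same walk() generator can be reused for any other query over the tree.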
> This isn't obviously what eLyXer was designed for and there were some
> glitches, such as having to patch extra values into ContainerConfig for
> parsing custom character style insets.
If you want to share your patches (so that I can integrate them and
release them with the next version), this section might be interesting:
http://www.nongnu.org/elyxer/devguide.html#toc-Section-3
> But eLyXer in its current state
> would be a good basis for a universal LyX file parser and document object
> model in Python; one could then refactor the HTML converter to work on this
> document object model. This would be useful for everybody processing LyX
> files with Python scripts, also if the file format eventually changes
> towards XML or whatever. It's obviously nothing for an 1.0 release, more
> like 2.0, but what do you think of it?
How would you improve on eLyXer's object model? I am open to
suggestions -- I have not worked harder on it because it fits my needs
right now, but if it can be useful to others I'm willing to change it
as needed.
Alex.